U.S. patent application number 15/736296 was filed with the patent office on 2018-07-05 for systems and methods for obtaining biological molecules from a sample. The applicant listed for this patent is THE TRANSLATIONAL GENOMICS RESEARCH INSTITUTE. Invention is credited to Matthew Huentelman.
Application Number | 20180187260 15/736296 |
Document ID | / |
Family ID | 57546528 |
Filed Date | 2018-07-05 |
United States Patent Application | 20180187260 |
Kind Code | A1 |
Huentelman; Matthew | July 5, 2018 |
SYSTEMS AND METHODS FOR OBTAINING BIOLOGICAL MOLECULES FROM A SAMPLE
Abstract
The present invention relates to a method of creating a biomarker profile, the method comprising the steps of: obtaining a sample of biofluid from a subject, wherein the sample is stored on a sample collection apparatus; removing the sample from the sample collection apparatus; extracting nucleic acids from the sample; sequencing the extracted nucleic acids to generate sequence data; and analyzing the sequence data using a two-step analytical methodology to create the biomarker profile. The present invention also relates to methods of determining the sex of an in utero fetus, predicting onset of a migraine, and tracking athletic performance in a subject.
Inventors: | Huentelman; Matthew; (Phoenix, AZ) |
Applicant: | THE TRANSLATIONAL GENOMICS RESEARCH INSTITUTE |
Family ID: | 57546528 |
Appl. No.: | 15/736296 |
Filed: | June 17, 2016 |
PCT Filed: | June 17, 2016 |
PCT NO: | PCT/US2016/038243 |
371 Date: | December 13, 2017 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number | ||
---|---|---|---|---|
62181041 | Jun 17, 2015 | |||
Current U.S. Class: | 1/1 |
Current CPC Class: | C12Q 2600/118 20130101; C12Q 2600/166 20130101; C12Q 2600/16 20130101; C12Q 1/6879 20130101; C12Q 2600/124 20130101; C12Q 2600/158 20130101; C12Q 1/6876 20130101; C12Q 1/6869 20130101; G01N 2800/52 20130101; C12Q 1/6883 20130101; C12Q 1/6809 20130101; G01N 2800/2807 20130101 |
International Class: | C12Q 1/6883 20060101 C12Q001/6883; C12Q 1/6869 20060101 C12Q001/6869; C12Q 1/6809 20060101 C12Q001/6809 |
Claims
1. A method of creating a biomarker profile, the method comprising the steps of: obtaining a sample of biofluid from a subject; extracting nucleic acids from the sample; sequencing the extracted nucleic acids to generate sequence data; and analyzing the sequence data using an analytical methodology to create the biomarker profile.
2. The method of claim 1, wherein the biofluid is selected from the group consisting of one or more drops of blood, plasma, serum, urine, sputum, cerebrospinal fluid, milk, and ductal fluid.
3. (canceled)
4. The method of claim 1, wherein the sample is placed on cellulose paper to dry.
5. The method of claim 4, wherein the cellulose paper has not been treated with any chemical stabilizers of nucleic acids.
6. The method of claim 1, wherein the nucleic acids are intracellular RNA.
7. (canceled)
8. The method of claim 6, wherein the analytical methodology comprises: a) determining: a coefficient of variance of the RNA transcript from the sample; and a coefficient of variance of an RNA transcript from a reference; and b) removing the sample RNA transcript from the biomarker profile if the sample coefficient is greater than the reference coefficient.
9. The method of claim 8, wherein the reference sample is not allowed to dry prior to extracting nucleic acids.
10. The method of claim 1, wherein the sample is obtained from the subject through a non-invasive methodology and the nucleic acids are extracellular RNA.
11. (canceled)
12. A method of determining the sex of an in utero fetus, the method comprising the steps of: obtaining a sample of biofluid from a pregnant mother; extracting nucleic acids from the sample; sequencing the extracted nucleic acids to generate sequence data; and analyzing the sequence data to determine the sex of the in utero fetus, wherein the in utero fetus is male if expression of Y chromosome nucleic acids is similar to or greater than expression of X chromosome nucleic acids in the sample.
13. The method of claim 12, wherein the sample comprises a single drop of blood.
14. The method of claim 12, wherein the nucleic acids are extracellular nucleic acids.
15. A method of predicting onset of a migraine in a subject, the method comprising the steps of: obtaining a set of samples of biofluid from the subject over time intervals; extracting nucleic acids from the set of samples; sequencing the extracted nucleic acids to generate sequence data; and analyzing the sequence data to identify sudden increases in gene expression of ATP binding cassette subfamily C member 1 (ABCC1) and/or syntaxin binding protein 3 (STXBP3), wherein a sudden increase in expression of ABCC1 and/or STXBP3 indicates onset of a migraine in the subject.
16. The method of claim 15, wherein a sudden increase in expression is an increase of at least 20 times in a 12-hour period.
17. The method of claim 15, wherein the time intervals are 4 hours, 6 hours, 12 hours, or 24 hours.
18. The method of claim 15, wherein the set of samples is a set of single drops of blood allowed to dry.
19. The method of claim 15, further comprising treating the subject for migraine, wherein treating the migraine comprises administering to the subject an effective amount of a non-steroidal anti-inflammatory drug (NSAID), a triptan, an ergotamine, metoclopramide, lidocaine, or a combination thereof.
20. A method of tracking athletic performance in a subject, the method comprising the steps of: obtaining a set of samples of biofluid from the subject collected before, during, and after aerobic exercise; extracting nucleic acids from the set of samples; sequencing the extracted nucleic acids to generate sequence data; and analyzing the sequence data to identify increases in gene expression of dysferlin (DYSF) and/or matrix metallopeptidase 9 (MMP9), wherein an increase in expression of DYSF and/or MMP9 compared to a reference indicates improved athletic performance in the subject.
21. The method of claim 20, wherein the reference is a measurement of expression of DYSF and/or MMP9 in a set of samples from the subject determined from an earlier time point in the athletic training of the subject.
22. The method of claim 20, wherein the set of samples is a set of single drops of blood allowed to dry.
23. The method of claim 20, wherein improved athletic performance is indicated by increased endurance, greater muscle strength, or a combination thereof.
Description
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] This application claims priority to U.S. Provisional Application No. 62/181,041 filed Jun. 17, 2015, the contents of which are incorporated herein by reference in their entirety.
TECHNICAL FIELD
[0002] This application generally relates to systems and methods for obtaining biological molecules usable in downstream applications from a sample, and more specifically relates to systems and methods of extracting nucleic acids from a dried biological sample for downstream analyses, including sequencing analyses.
BACKGROUND
[0003] Preserving the structural and functional integrity of biological molecules or biomolecules during extraction, storage, isolation, and/or purification from a biological sample is essential for various downstream applications/analyses. For example, some of these downstream applications/analyses may include analyte detection, sensing, forensic, diagnostic, prognostic, theranostic, and/or therapeutic applications, sequencing, amplification, among other potential uses for these biomolecules. The ultimate success of these downstream applications may depend on maintaining the integral structure and function of the target biomolecules. For example, various factors, such as temperature, humidity, pH, chemical or enzymatic-mediated degradation, or the presence of contaminants may cause degradation of the biomolecules.
[0004] RNA is one of the most unstable biomolecules due to chemical self-hydrolysis and enzyme-mediated degradation. The storage, extraction, and stabilization of RNA derived from a biological sample is sensitive to a number of environmental factors including, but not limited to, the substance on or in which the sample is stored, the buffer used to extract or collect the RNA, solution pH, temperature, and the presence of ribonucleases. RNA is typically stored under refrigeration (e.g., 4° C. to -80° C.) in both purified and unpurified forms to prevent hydrolysis and enzymatic degradation and to preserve the integrity of the RNA sample. As such, it would be desirable to develop a methodology in which a sample can be obtained and stored at ambient temperatures and the RNA and other biomolecules can then be extracted.
[0005] Moreover, scientists looking to perform next-generation sequencing (NGS) must consider the manner and method of sample preparation. The way that DNA or RNA is isolated from a sample and subsequently stored, the preparation chosen to construct sequencing libraries, and the type of sequencing that is being performed, all become crucial factors in the experimental design (Baudhuin L. M. (2013) Quality guidelines for next-generation sequencing. Clin Chem 59, 858-859).
[0006] For RNA sequencing in particular, classes of molecules are, at least in part, defined and sequenced by their size. MicroRNAs (miRNAs; 16-27 nucleotides (nt)), small interfering RNAs (siRNAs; 16-27 nt), and PIWI interacting RNAs (piRNA; ~30 nt) are all part of a class of small non-coding RNA involved in sequence-specific gene silencing (Castel S. E., Martienssen, R. A. (2013) RNA interference in the nucleus: roles for small RNAs in transcription, epigenetics and beyond. Nat Rev Genet 14, 100-112). While currently known as the smallest functional class, the depth of small RNA's biological significance to regulate gene expression is still being uncovered some 15 years after discovery (Fire A., Xu S., Montgomery M. K., Kostas S. A., et al. (1998) Potent and specific genetic interference by double-stranded RNA in Caenorhabditis elegans. Nature 391, 806-811).
[0007] Until recently, methods for isolating RNA from tissues of origin had been thought to recover all RNA species. Roughly from large to small, RNA as a family of molecules includes coding RNA (mRNA), long noncoding RNA (lncRNA), transfer RNA (tRNA), small nucleolar RNA (snoRNA), PIWI Interacting RNA (piRNA), and miRNA (Castel S. E., Martienssen, R. A. (2013) RNA interference in the nucleus: roles for small RNAs in transcription, epigenetics and beyond. Nat Rev Genet 14, 100-112). The purification of all species of RNA is implied in the description of many commercially available kits and methods touting "total" RNA isolation. In fact, it had been used for methods that do not recover small RNA at all, such as column-based kits that washed the small RNA off the column during the cleaning steps. In addition, other kits used ratios of salt and alcohol that are too low to precipitate small RNA out of solution. There are now many commercially available kits for small RNA purification from which to choose. Systematic testing shows that the performance of RNA extraction kits varies quite a bit depending on the type of sample. Reasonably, different kits may deal with a particular sample type better than another. For example, a fibrous tissue such as muscle has to be handled differently than lipid-rich nervous tissue. When available, the best option may be to choose a kit specifically designed to deal with the challenges of a particular type of tissue. There is a need to identify methods to maximize the amount of RNA extracted from biological samples with any given extraction kit especially when the material is limited.
[0008] The discovery and reliable detection of markers for any type of disease or condition may be complicated by the relative inaccessibility of some forms of tissue (e.g., central nervous system tissue) or an inability to biopsy or test tissue. RNAs derived from hard to access tissues, such as neurons within the brain and spinal cord, have the potential to get to the periphery where they can be detected non-invasively. The formation and release of extracellular microvesicles and RNA binding proteins have been found to carry RNA from cells to the periphery and protect the RNA from degradation. Extracellular miRNAs detectable in peripheral circulation can provide information about cellular changes associated with human health and disease. In order to associate miRNA signals present in cell-free peripheral biofluids, there is a need to develop systems and methodology for obtaining, storing, extracting, and performing downstream analyses on these biofluids.
[0009] The ability to meaningfully profile peripheral biofluids to monitor and gain insights about the underlying conditions and diseases would bring significant benefits to monitoring disease progression and treatment efficacy. Development of diagnostic tests and preventative and treatment therapies for diseases and conditions of medical concern is encumbered by the complexity of pathomechanisms of some of these diseases and conditions, as well as the difficulty of achieving an accurate diagnosis in early, asymptomatic stages of diseases and conditions.
[0010] As such, there is great interest in the identification of biomarkers in the blood and other biofluids. However, due to the concerns regarding sampling from CSF (e.g., extensive numbers of punctures of the spinal column), large volumes of urine needed for biomarker extraction, and difficult collection regimens with which patients may have to comply (e.g., saliva collection), there is a need to provide a simple and easily usable methodology for the ready collection of biofluids and downstream isolation and processing.
[0011] The articles, treatises, patents, references, and published patent applications described above and herein are hereby incorporated by reference in their entirety for all purposes.
SUMMARY
[0012] The present invention is directed to a method of creating a biomarker profile, the method comprising the steps of: obtaining a sample of biofluid from a subject, wherein the sample is stored on a sample collection apparatus; removing the sample from the sample collection apparatus; extracting nucleic acids from the sample; sequencing the extracted nucleic acids to generate sequence data; and analyzing the sequence data using a two-step analytical methodology to create the biomarker profile.
[0013] In some aspects, the biofluid is selected from the group consisting of blood, plasma, serum, urine, sputum, cerebrospinal fluid, milk, and ductal fluid. In one embodiment, the biofluid is a single drop of blood.
[0014] In other aspects, the sample collection apparatus comprises cellulose paper on which the biofluid is placed to dry. In one aspect, the cellulose paper has not been treated with any chemical stabilizers of nucleic acids.
[0015] In various embodiments, the nucleic acids are RNA. In a particular embodiment, the RNA is extracellular RNA.
[0016] In certain embodiments, the two-step analytical methodology comprises: a) determining i) and ii) as: i) the coefficient of variance for an RNA transcript in the sample; and ii) the coefficient of variance of the RNA transcript in a reference sample; and b) removing the RNA transcript from the biomarker profile if i) is greater than ii). In one aspect, the reference sample is not allowed to dry prior to extracting nucleic acids.
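By way of illustration only, the two-step analytical methodology can be sketched in Python as follows. The sketch assumes that the coefficient of variance is computed as the sample standard deviation divided by the mean of replicate expression values, and that transcripts lacking reference data are left out of the profile; neither assumption, nor the function names and example values, is taken from the specification.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the sample standard deviation divided by the mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("mean expression is zero; coefficient is undefined")
    return stdev(values) / m


def filter_profile(sample_values, reference_values):
    """Apply the two steps to every transcript.

    Both arguments map a transcript name to a list of replicate
    expression values (at least two values per transcript).
    """
    profile = {}
    for transcript, values in sample_values.items():
        # Assumption: a transcript with no reference data cannot be assessed,
        # so it is left out of the profile.
        if transcript not in reference_values:
            continue
        # Step a): determine both coefficients.
        cv_sample = coefficient_of_variation(values)
        cv_reference = coefficient_of_variation(reference_values[transcript])
        # Step b): remove the transcript if the sample coefficient is greater.
        if cv_sample > cv_reference:
            continue
        profile[transcript] = values
    return profile


# Hypothetical replicate values, for illustration only.
dried_sample = {"TRANSCRIPT_A": [100, 102, 98], "TRANSCRIPT_B": [10, 50, 90]}
wet_reference = {"TRANSCRIPT_A": [100, 110, 90], "TRANSCRIPT_B": [48, 50, 52]}
profile = filter_profile(dried_sample, wet_reference)
print(sorted(profile))  # TRANSCRIPT_B is removed; TRANSCRIPT_A is retained
```

In this hypothetical example, TRANSCRIPT_B varies more in the dried sample than in the reference and is therefore excluded from the biomarker profile.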
[0017] In yet other aspects, the sample is obtained from the subject through a non-invasive methodology such as a finger prick.
[0018] The present invention also relates to a method of determining the sex of an in utero fetus, the method comprising the steps of: obtaining a sample of biofluid from a pregnant mother, wherein the sample is stored on a sample collection apparatus; removing the sample from the sample collection apparatus; extracting nucleic acids from the sample; sequencing the extracted nucleic acids to generate sequence data; and analyzing the sequence data to determine the sex of the in utero fetus, wherein the in utero fetus is male if expression of Y chromosome nucleic acids is similar to or greater than expression of X chromosome nucleic acids in the sample. In certain aspects, the nucleic acids are extracellular nucleic acids.
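By way of illustration only, the comparison of Y chromosome and X chromosome expression can be sketched in Python as follows. The specification does not quantify "similar to"; the `similarity_tolerance` parameter (Y expression within 20% below X expression) is a hypothetical choice made solely for this sketch, as are the function name and example values.

```python
def is_male_call(y_expression, x_expression, similarity_tolerance=0.2):
    """Return True if Y expression is similar to or greater than X expression.

    similarity_tolerance is a hypothetical parameter: Y expression that is at
    most this fraction below X expression is treated as "similar".
    """
    if y_expression < 0 or x_expression < 0:
        raise ValueError("expression values must be non-negative")
    if y_expression == 0 and x_expression == 0:
        raise ValueError("no X or Y expression detected; no call can be made")
    return y_expression >= x_expression * (1.0 - similarity_tolerance)


# Hypothetical summed expression values for Y- and X-specific markers.
print(is_male_call(95.0, 100.0))   # Y is within 20% of X
print(is_male_call(5.0, 100.0))    # Y is far below X
```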
[0019] The present invention is also directed to a method of predicting onset of a migraine in a subject, the method comprising the steps of: obtaining a set of samples of biofluid from the subject collected over time intervals and stored on sample collection apparatuses; removing the set of samples from the sample collection apparatuses; extracting nucleic acids from the set of samples; sequencing the extracted nucleic acids to generate sequence data; and analyzing the sequence data to identify sudden increases in gene expression of ATP binding cassette subfamily C member 1 (ABCC1) and/or syntaxin binding protein 3 (STXBP3), wherein a sudden increase in expression of ABCC1 and/or STXBP3 indicates onset of a migraine in the subject.
[0020] In certain aspects, a sudden increase in expression is an increase of at least 5 times, at least 10 times, at least 15 times, at least 20 times, or at least 30 times in a 6-hour period, in a 12-hour period, or a 24-hour period.
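By way of illustration only, a sudden increase in expression across a time series of samples can be flagged with a sketch such as the following. The defaults (at least 20 times within a 12-hour period) are one of the combinations recited above; the function name, the pairwise comparison of samples, and the example counts are assumptions made for this sketch.

```python
def find_sudden_increases(timepoints_h, expression, min_fold=20.0, window_h=12.0):
    """Return (start_h, end_h, fold) for every pair of samples in which
    expression rose by at least min_fold within window_h hours.

    timepoints_h must be sorted in ascending order (hours since first sample).
    """
    if len(timepoints_h) != len(expression):
        raise ValueError("timepoints and expression must have the same length")
    events = []
    for i in range(len(timepoints_h)):
        if expression[i] <= 0:
            continue  # fold change is undefined for a zero baseline
        for j in range(i + 1, len(timepoints_h)):
            if timepoints_h[j] - timepoints_h[i] > window_h:
                break  # later samples fall outside the window
            fold = expression[j] / expression[i]
            if fold >= min_fold:
                events.append((timepoints_h[i], timepoints_h[j], fold))
    return events


# Hypothetical normalized ABCC1 counts sampled every 6 hours.
times = [0, 6, 12, 18, 24]
abcc1 = [10, 12, 11, 300, 320]
events = find_sudden_increases(times, abcc1)
print(events[0])  # first flagged pair: from hour 6 to hour 18
```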
[0021] In other aspects, the time intervals are 1 hour, 2 hours, 3 hours, 4 hours, 5 hours, 6 hours, 8 hours, 10 hours, 12 hours or 24 hours.
[0022] In some aspects, the set of samples is a set of single drops of blood allowed to dry on the sample collection apparatuses.
[0023] In one embodiment, the method further comprises treating the subject for migraine, wherein treating the migraine comprises administering to the subject an effective amount of a non-steroidal anti-inflammatory drug (NSAID), a triptan, an ergotamine, metoclopramide, lidocaine or a combination thereof. Treatment may be initiated just prior to onset of the migraine.
[0024] In yet other aspects, the present invention relates to a method of tracking athletic performance in a subject, the method comprising the steps of: obtaining a set of samples of biofluid from the subject collected before, during, and after aerobic exercise and stored on sample collection apparatuses; removing the set of samples from the sample collection apparatuses; extracting nucleic acids from the set of samples; sequencing the extracted nucleic acids to generate sequence data; and analyzing the sequence data to identify increases in gene expression of dysferlin (DYSF) and/or matrix metallopeptidase 9 (MMP9), wherein an increase in expression of DYSF and/or MMP9 compared to a reference indicates improved athletic performance in the subject.
[0025] In one aspect, the reference is a measurement of expression of DYSF and/or MMP9 in a set of samples from the subject determined from an earlier time point in the athletic training of the subject.
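By way of illustration only, the comparison of DYSF and/or MMP9 expression against an earlier reference set from the same subject can be sketched as follows. Summarizing each set of samples by its mean is an assumption of this sketch, as are the function name and example counts; the specification does not prescribe a particular summary statistic.

```python
from statistics import mean


def increased_markers(current, reference, markers=("DYSF", "MMP9")):
    """Return {marker: fold_change} for markers whose mean expression in the
    current sample set exceeds the mean in the reference sample set."""
    increased = {}
    for marker in markers:
        current_mean = mean(current[marker])
        reference_mean = mean(reference[marker])
        if reference_mean > 0 and current_mean > reference_mean:
            increased[marker] = current_mean / reference_mean
    return increased


# Hypothetical normalized counts from two exercise sessions of one subject.
earlier_session = {"DYSF": [100, 120, 110], "MMP9": [200, 220, 210]}
later_session = {"DYSF": [150, 180, 165], "MMP9": [190, 210, 200]}
result = increased_markers(later_session, earlier_session)
print(result)  # only DYSF exceeds its reference in this example
```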
[0026] In certain aspects, improved athletic performance is indicated by increased endurance, greater muscle strength or a combination thereof.
BRIEF DESCRIPTION OF THE DRAWINGS
[0027] FIG. 1 depicts the standard curve used to calculate the RNA total yields in Table 1.
[0028] FIG. 2 depicts the standard curve used to calculate the RNA total yields in Table 2.
[0029] FIG. 3 depicts the standard curve used to calculate the RNA total yields in Table 3.
[0030] FIG. 4 depicts the standard curve used to calculate the RNA total yields in Table 4.
[0031] FIG. 5 depicts an assessment of the quality of RNA preparations from drops of wet whole blood by capillary electrophoresis.
[0032] FIG. 6 depicts an analysis of the integrity of RNA preparations from drops of wet whole blood performed on an Agilent Bioanalyzer. The amount of RNA is depicted in fluorescence units (FU) for RNA molecules ranging in size from <25 nt to >4,000 nt.
[0033] FIG. 7 depicts an assessment of the quality of RNA preparations from dried blood spots previously dried on a sample collection apparatus by capillary electrophoresis.
[0034] FIG. 8 depicts an analysis of the integrity of RNA preparations from dried blood spots previously dried on a sample collection apparatus performed on an Agilent Bioanalyzer. The amount of RNA is depicted in FU for RNA molecules ranging in size from <25 nt to >4,000 nt.
[0035] FIG. 9 depicts an assessment of the quality of RNA preparations from dried blood spots by capillary electrophoresis.
[0036] FIG. 10 depicts analysis of the integrity of RNA preparations from dried blood spots performed on an Agilent Bioanalyzer. The amount of RNA is depicted in FU for RNA molecules ranging in size from <25 nt to >4,000 nt.
[0037] FIG. 11 depicts an assessment of the quality of RNA preparations from dried blood spots collected from a subject at several time points before and after exercising by capillary electrophoresis.
[0038] FIG. 12 depicts analysis of the integrity of RNA preparations from dried blood spots collected from a subject at several time points before and after exercising performed on an Agilent Bioanalyzer. The amount of RNA is depicted in FU for RNA molecules ranging in size from <25 nt to >4,000 nt.
[0039] FIG. 13 depicts an assessment of the quality of RNA preparations from dried blood spots collected from a subject at several time points before and after exercising by capillary electrophoresis.
[0040] FIG. 14 depicts analysis of the integrity of RNA preparations from dried blood spots collected from a subject at several time points before and after exercising performed on an Agilent Bioanalyzer. The amount of RNA is depicted in FU for RNA molecules ranging in size from <25 nt to >4,000 nt.
[0041] FIG. 15A depicts box plots of total RNA yield values from dried blood samples collected on FORTIUSBIO® RNASOUND™ blood sampling cards, WHATMAN® 903 Protein Saver cards or WHATMAN® FTA® non-indicating Elute Micro blood cards.
[0042] FIG. 15B depicts box plots with the same experimental data as that presented in FIG. 15A except the values are shown on a log scale.
[0043] FIG. 16 depicts normalized counts of DYSF (dysferlin) and MMP9 (matrix metallopeptidase 9) analyzed in dried blood samples from a human subject at 5 am and 9 am (pre-exercise), at ten minute intervals during exercise, and hourly post-exercise.
[0044] FIG. 17 depicts analysis of cell-free RNA in maternal plasma and measurement of the expression of biomarkers specific to the X chromosome or to the Y chromosome to determine fetal sex.
[0045] FIG. 18 depicts two time series of dried blood sample collection and analysis conducted with a human subject where samples were drawn before, during, and after onset of a migraine. Onset of migraine is indicated in the charts by the vertical gray bar between days 5 and 6.
DETAILED DESCRIPTION
[0046] Some embodiments of the invention provide methods for obtaining, storing, isolating, extracting, and/or analyzing one or more biomolecules from a sample. For example, some embodiments of the invention can be used by subjects to obtain samples on a regular or irregular, frequent or infrequent basis. In some aspects, embodiments of the instant invention can be used to obtain samples from the subjects in relatively small volumes or other quantities at regular intervals. By way of example only, some aspects of the invention can be employed in obtaining, storing, isolating, extracting, and/or analyzing biomolecules in a small volume of biofluid (e.g., drop-like quantities of blood, plasma, serum, cerebrospinal fluid, urine, saliva, etc.). Moreover, in some embodiments, the methodologies of the instant invention can be used in conjunction with the small quantity of biofluid to obtain multiple samples from a single subject, potentially over an extended period of time (e.g., longitudinal samples from one or more subjects). In other words, due to the relatively small quantities required for use with the instant methodologies, the subjects may be able to obtain samples for downstream analyses on a regular basis (e.g., minutes, hours, days, weeks, months, years, etc.).
[0047] In some embodiments, the methodology of the instant invention can be used in conjunction with the identification/analysis of one or more markers of one or more diseases, conditions, medical states, etc. For example, in some embodiments, methodologies of the invention can be used to identify and/or analyze one or more biomarkers associated with a disease, condition, and/or medical state using a sample of relatively small quantities. As such, embodiments of the invention can be employed in medically related analyses to diagnose, assess, provide prognostic information, and make therapeutic decisions regarding any biologically related state. In other words, any state of the subject may be assessed using some embodiments of the invention.
[0048] As used herein, the verb "comprise" and its conjugations, as used in this description and in the claims, are used in their non-limiting sense to mean that items following the word are included, but items not specifically mentioned are not excluded. In addition, reference to an element by the indefinite article "a" or "an" does not exclude the possibility that more than one of the elements are present, unless the context clearly requires that there is one and only one of the elements. The indefinite article "a" or "an" thus usually means "at least one."
[0049] As used herein, the term "subject" or "patient" refers to any vertebrate including, without limitation, humans and other primates (e.g., chimpanzees and other apes and monkey species), farm animals (e.g., cattle, sheep, pigs, goats and horses), domestic mammals (e.g., dogs and cats), laboratory animals (e.g., rodents such as mice, rats, and guinea pigs), and birds (e.g., domestic, wild and game birds such as chickens, turkeys and other gallinaceous birds, ducks, geese, and the like). In some embodiments, the subject is a mammal. In other embodiments, the subject is a human.
[0050] As used herein the term "diagnosing" or "diagnosis" refers to the process of identifying a medical condition or disease by its signs, symptoms, and in particular from the results of various diagnostic procedures, including e.g. detecting the expression of the nucleic acids according to at least some embodiments of the invention in a biological sample obtained from an individual. Furthermore, as used herein the term "diagnosing" or "diagnosis" encompasses screening for a disease, screening for the presence and/or absence of a condition, such as a medical condition, detecting a presence or a severity of a disease, distinguishing a disease from other diseases including those diseases that may feature one or more similar or identical symptoms, providing prognosis of a disease, monitoring disease progression or relapse, as well as assessment of treatment efficacy and/or relapse of a disease, disorder or condition, as well as selecting a therapy and/or a treatment for a disease, optimization of a given therapy for a disease, monitoring the treatment of a disease, and/or predicting the suitability of a therapy for specific patients or subpopulations or determining the appropriate dosing of a therapeutic product in patients or subpopulations. The diagnostic procedure can be performed in vivo or in vitro. In some embodiments, the methodologies according to the invention can be used in diagnosing diseases, conditions, etc. using samples of relatively small volumes of biofluids, such as one or more drops of blood.
[0051] "Detection" as used herein refers to detecting the presence of a component (e.g., a nucleic acid sequence) in a sample. Detection also means detecting the absence of a component. Detection also means measuring the level of a component, either quantitatively or qualitatively. With respect to the method of the invention, detection also means identifying or diagnosing one or more conditions or stages/likely successful therapeutic solutions in a subject. "Early detection" as used herein refers to identifying or diagnosing conditions or diseases in a subject at an early stage of the disease or condition (e.g., before there are any detectable/noticeable symptoms).
[0052] "Differential expression" as used herein refers to qualitative or quantitative differences in the temporal and/or cellular expression patterns of an RNA transcript and/or translated peptide/protein within and among cells and tissue. For example, a differentially expressed transcript can qualitatively have its expression altered, including an activation or inactivation, in, e.g., normal versus disease/altered state/condition tissue. Genes, for instance, may be turned on or turned off in a particular state, relative to another state thus permitting comparison of two or more states. A qualitatively regulated gene or transcript may exhibit an expression pattern within a state or cell type that may be detectable by standard techniques. Some transcripts will be expressed in one state or cell type, but not in both. Alternatively, the difference in expression may be quantitative, e.g., in that expression is modulated, up-regulated, resulting in an increased amount of transcript, or down-regulated, resulting in a decreased amount of transcript. The degree to which expression differs need only be large enough to quantify via standard characterization techniques such as expression arrays, quantitative reverse transcriptase PCR, northern analysis, whole transcriptome/RNA sequencing, RNase protection, and any other methods now known or developed in the future.
[0053] In some embodiments, the term "level" refers to the expression level of a nucleic acid according to at least some embodiments of the present invention. Typically the level of the nucleic acid in a biological sample obtained from the subject is different (e.g., increased or decreased) from the level of the same nucleic acid in a similar sample obtained from a healthy individual (examples of biological samples are described herein). Alternatively, the level of the nucleic acid in a biological sample obtained from the subject is different (e.g., increased) from the level of the same nucleic acid in a similar sample obtained from the same subject at an earlier time point. Alternatively, the level of the nucleic acid in a biological sample obtained from the subject is different (e.g., increased) from the level of the same nucleic acid in a non-diseased tissue obtained from said subject. Typically, the expression levels of the nucleic acid of the invention are independently compared to their respective control level.
[0054] The term "expression level" is used broadly to include a genomic expression profile, e.g., an expression profile of nucleic acids. Profiles may be generated by any convenient means for determining a level of a nucleic acid sequence e.g. quantitative hybridization of nucleic acid, labeled nucleic acid, amplified nucleic acid, cDNA, etc., quantitative PCR, ELISA for quantitation, sequencing (e.g., RNA sequencing) and the like, and allow the analysis of differential gene expression between two samples. A subject or sample, e.g., cells or collections thereof (e.g., tissues, fluids, etc.) is assayed. Samples are collected by any convenient method, as known in the art. According to some embodiments, the term "expression level" means measuring the abundance of the nucleic acid in the measured samples.
[0055] Expression level or other determinable traits regarding nucleic acids may function as one or more markers. As described herein, the markers are preferably then correlated with the presence or stage of a disease, condition, or medical state. For example, such correlating may optionally comprise determining the concentration of each of the plurality of markers, and individually comparing each marker concentration (e.g., expression level) to a threshold level. Optionally, if the marker concentration is above the threshold level, the marker concentration correlates with diseases, conditions and possibly stages thereof. Optionally, a plurality of marker concentrations correlates with neurological conditions and stages/treatments thereof. Alternatively, such correlating may optionally comprise determining the concentration of each of the plurality of markers, calculating a single index value based on the concentration of each of the plurality of markers, and comparing the index value to a threshold level. Also alternatively, such correlating may optionally comprise determining a temporal change in at least one of the markers, and wherein the temporal change is used in the correlating step.
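By way of illustration only, the three correlating approaches described above (individual thresholds, a single index value, and a temporal change) can be sketched as follows. The weighted sum used for the index value is an assumption of this sketch, since the specification does not define how the index is calculated; the marker names, weights, and thresholds are likewise hypothetical.

```python
def markers_above_threshold(concentrations, thresholds):
    """Compare each marker concentration individually to its own threshold."""
    return [m for m, c in concentrations.items() if c > thresholds[m]]


def index_value(concentrations, weights):
    """Combine all marker concentrations into a single index value.

    Assumption: the index is a weighted sum of the concentrations.
    """
    return sum(weights[m] * c for m, c in concentrations.items())


def temporal_change(earlier, later):
    """Return the change in one marker between two time points."""
    return later - earlier


# Hypothetical marker panel, for illustration only.
concentrations = {"marker_1": 8.0, "marker_2": 3.0}
thresholds = {"marker_1": 5.0, "marker_2": 4.0}
weights = {"marker_1": 0.5, "marker_2": 2.0}

above = markers_above_threshold(concentrations, thresholds)
index = index_value(concentrations, weights)
index_exceeds = index > 9.0  # 9.0 is a hypothetical index threshold
change = temporal_change(earlier=2.0, later=8.0)
print(above, index, index_exceeds, change)
```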
[0056] A marker panel may be analyzed in a number of fashions well known to those of skill in the art. For example, each member of a panel may be compared to a "normal" value, or a value indicating a particular outcome. A particular diagnosis/prognosis may depend upon the comparison of each marker to this value; alternatively, if only a subset of markers is outside of a normal range, this subset may be indicative of a particular diagnosis/prognosis. The skilled artisan will also understand that diagnostic markers, differential diagnostic markers, prognostic markers, time of onset markers, disease or condition differentiating markers, etc., may be combined in a single assay or device. Markers may also be commonly used for multiple purposes by, for example, applying a different threshold or a different weighting factor to the marker for the different purpose(s).
[0057] In the methods of the invention, a "significant elevation" in expression levels of the plurality of markers/nucleic acids refers, in different embodiments, to a statistically significant elevation, or in other embodiments to a significant elevation as recognized by a skilled artisan. In additional embodiments, a significant elevation refers to an increase in the expression of a plurality of markers/nucleic acids.
[0058] The term "about" as used herein refers to +/-10%.
[0059] Diagnostic methods differ in their sensitivity and specificity. The "sensitivity" of a diagnostic assay is the percentage of diseased individuals who test positive (percent of "true positives"). Diseased individuals not detected by the assay are "false negatives". Subjects who are not diseased and who test negative in the assay are termed "true negatives". The "specificity" of a diagnostic assay is 1 minus the false positive rate, where the "false positive" rate is defined as the proportion of those without the disease who test positive. While a particular diagnostic method may not provide a definitive diagnosis of a condition, it suffices if the method provides a positive indication that aids in diagnosis.
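As a minimal sketch of the definitions above, sensitivity and specificity can be computed from the four outcome counts of a diagnostic assay. The function names and the example counts are hypothetical and are not taken from the description.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of diseased individuals who test positive."""
    return true_pos / (true_pos + false_neg)


def false_positive_rate(false_pos: int, true_neg: int) -> float:
    """Fraction of non-diseased individuals who test positive."""
    return false_pos / (false_pos + true_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """1 minus the false positive rate."""
    return 1.0 - false_positive_rate(false_pos, true_neg)


# Hypothetical counts: 100 diseased and 100 non-diseased individuals.
sens = sensitivity(true_pos=90, false_neg=10)
spec = specificity(true_neg=80, false_pos=20)
```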
[0060] Diagnosis of a disease, condition, or medical state according to at least some embodiments of the present invention can be effected by determining a level of a polynucleotide according to at least some embodiments of the present invention in a biological sample obtained from the subject, wherein the level determined can be correlated with predisposition to, or presence or absence of the disease.
[0061] The term "sample" or "biological sample" as used herein means a sample of biological tissue or fluid/biofluid or an excretion sample that may comprise biological molecules, such as nucleic acids. Such samples include, but are not limited to, tissue or fluid isolated from subjects. Biological samples may also include sections of tissues such as biopsy and autopsy samples, frozen sections, blood, plasma, serum (SER), sputum, stool and mucus from a living or deceased subject. In some specific embodiments, the sample may comprise a small volume of a biofluid, such as blood. For example, in some aspects, the sample may comprise one or more drops of blood that have been obtained from a finger puncture of the subject. Biological sample also refers to organs such as liver, lung, and peritoneum. Biological samples also include explants and primary and/or transformed cell cultures derived from animal or patient tissues. Biological samples may also be blood, a blood fraction, gastrointestinal secretions, or tissue sample. A biological sample may be provided by removing a sample of cells from an animal, but can also be accomplished by using previously isolated cells (e.g., isolated by another person, at another time, and/or for another purpose), or by performing the methods described herein in vivo. Archival tissues, such as those having treatment or outcome history, may also be used.
[0062] In some embodiments the sample obtained from the subject is a body fluid or excretion sample including but not limited to seminal plasma, blood, SER, urine, prostatic fluid, seminal fluid, semen, the external secretions of the skin, respiratory, intestinal, and genitourinary tracts, tears, CSF, sputum, saliva, milk, peritoneal fluid, pleural fluid, cyst fluid, lavage of body cavities, bronchoalveolar lavage, lavage of the reproductive system and/or lavage of any other organ of the body or system in the body, and stool.
[0063] Numerous well known tissue or fluid collection methods can be utilized to collect the biological sample from the subject in order to determine the expression level of the biomarkers of the invention in said sample of said subject.
[0064] Examples include, but are not limited to, blood sampling, urine sampling, stool sampling, sputum sampling, aspiration of pleural or peritoneal fluids, fine needle biopsy, needle biopsy, core needle biopsy and surgical biopsy, and lavage. Regardless of the procedure employed, once a biopsy/sample is obtained the level of the markers/nucleic acids can be determined and a diagnosis can thus be made.
[0065] In some embodiments, the sample can be collected and/or stored using a sample collection apparatus. In some embodiments, the sample collection apparatus can be configured and arranged to receive a liquid biofluid and to enable storage of that liquid biofluid at ambient and/or refrigerated temperatures. For example, in some aspects, the sample collection apparatus can be configured to receive the biofluid such that the biofluid can be absorbed into/within the structure of the apparatus for drying purposes. Moreover, in some aspects, the sample collection apparatus can be configured to provide a relatively nuclease-free environment for the biofluid. For example, in some aspects, the sample collection apparatus can be prepared such that it is substantially or completely free of nucleases (e.g., any enzymes that may degrade or destroy nucleic acids, such as RNA). As such, the sample collection apparatus may function to preserve the state of some or all of the nucleic acids contained within the biofluid, including intracellular and extracellular nucleic acids, such as RNA contained within exosomes.
[0066] In some embodiments, the sample collection apparatus may be a solid substance. In some aspects, the sample collection apparatus may be cellulose paper, Whatman paper, bibulous paper, cotton-based paper, cotton-based fabric, or any other substance that can be configured to receive the sample. By way of example only, in some aspects, the sample collection apparatus may be an RNA extraction strip card from FORTIUSBIO®. In other aspects, the sample collection apparatus may be a plasma-concentration device from Shimadzu.
[0067] The term "nucleic acid" or "polynucleotide" as referred to herein comprises all forms of RNA (mRNA, miRNA, rRNA, tRNA, piRNA, ncRNA), DNA (genomic DNA or mtDNA), as well as recombinant RNA and DNA molecules or analogues of DNA or RNA generated using nucleotide analogues. The nucleic acids may be single stranded or double stranded. The nucleic acids may include the coding or non-coding strands. The term also comprises fragments of nucleic acids, such as naturally occurring RNA or DNA which may be recovered using one or more extraction methods disclosed herein. "Fragment" refers to a portion of a nucleic acid (e.g., RNA or DNA).
[0068] In some aspects, mRNA and/or miRNAs can be used in embodiments of the methodology. miRNAs are a large class of single strand RNA molecules of approximately 16-25 nucleotides, involved in post transcriptional gene silencing. Eighty percent of conserved miRNA show tissue-specific expression and play an important role in cell fate determination, proliferation, and cell death (Lee and Dutta. Annu. Rev. Pathol. Mech. Dis. 2009; 4: 199-227; Ross, Carlson and Brock, Am J Clin Path 2007: 128; 830-836). miRNAs arise from intergenic or intragenic (both exonic and intronic) genomic regions that are transcribed as long primary transcripts (pri-microRNA) and undergo a number of processing steps to produce the final short mature molecule (Massimo et al., Current Op. in Cell Biol. 2009: 21; 1-10).
[0069] The mature miRNAs suppress gene expression based on their complementarity to a part of one or more mRNAs, usually in the 3' UTR site. The annealing of miRNA to the target transcript either blocks protein translation or destabilizes the transcript and triggers its degradation, or both. Most of the miRNA action on target mRNA translation is based on partial complementarity; therefore, conceivably one miRNA may target more than one mRNA and many miRNAs may act on one mRNA (Ying et al., Mol. Biotechnol. 2008: 38; 257-268). In humans, approximately one-third of miRNAs are organized into clusters. A given cluster is likely to be a single transcriptional unit, suggesting a coordinated regulation of miRNAs in the cluster (Lee and Dutta. ibid).
[0070] The term "extracellular miRNA" means that the miRNA is found, located or circulates in a biofluid (biological fluid). For clarity, the term "extracellular miRNA" includes any one or more of miRNA found in exosomes or in other vesicles of cellular origin, miRNA originating from cells or more generally being of cellular origin, or being cellular isolates.
[0071] Biofluid can be, for example, blood, plasma, serum, urine, sputum, cerebrospinal fluid, milk, or ductal fluid, and can be fresh, frozen or fixed. For clarity, biofluid can comprise cells, cellular isolates, lysed cells or any type of cellular material. In some embodiments, the biofluid is blood, plasma or serum.
[0072] Although there is currently no definitive source identified for extracellular miRNAs (i.e., a definitive source leading to miRNAs locating within biofluids), blood cells, in particular reticulocytes, myeloid cells, lymphoid cells, platelets, cells from the liver, lungs and kidneys or lysed cells may release miRNAs into the circulation. Similarly, miRNAs may be discharged into biofluid/plasma following tissue damage, for example, following acute myocardial infarction.
[0073] There are a number of considerations when choosing protocols both upstream and downstream of NGS experiments. On the front end, purification methods, additives, and residuum can often inhibit the sensitive chemistries by which sequencing-by-synthesis is performed. On the back end, data handling, analysis software packages, and pipelines can also impact sequencing outcomes. The present invention provides methods of preparing biological samples (e.g., acellular biofluid samples) for small RNA sequencing.
[0074] The term "extraction" as used herein refers to any method for separating or isolating the nucleic acids from a sample, more particularly from a biological sample, such as blood. Nucleic acids such as RNA or DNA may be released, for example, by cell lysis. Moreover, in some aspects, extraction may also encompass the separation or isolation of extracellular RNAs (e.g., extracellular miRNAs) from one or more extracellular structures, such as exosomes.
[0075] Some embodiments of the invention include the extraction of one or more forms of nucleic acids from one or more samples. In some aspects, the extraction of the nucleic acids can be provided using one or more techniques known in the art. For example, in some aspects, the extraction steps can be accomplished using the QIAAMP® RNA Blood Kit from QIAGEN® (e.g., for the isolation of total RNA) or EXORNEASY® Serum/Plasma Kit from QIAGEN® (e.g., for the isolation of intracellular and/or extracellular RNA). In other embodiments, methodologies of the invention can use any other conventional methodology and/or product intended for the isolation of intracellular and/or extracellular nucleic acids (e.g., RNA).
[0076] In one embodiment, the present invention provides methods of sequencing the full profile of nucleic acids (e.g., RNA) from a biological sample (e.g., blood). In certain aspects, the present invention provides a method of obtaining enough RNA from biofluid samples to perform RNA sequencing. With the prior art methods it was difficult to make sufficient scientifically verifiable conclusions from the biofluid samples because these conventional methodologies did not employ some of the advances contained herein, including performing RNA sequencing. As described herein, the inventors provide methods of diagnosing and identifying diseases, conditions, and medical states as the expression of the nucleic acids changes with various conditions.
[0077] The present invention also provides for the sequencing of RNA from samples (i.e., blood/plasma) from subjects. The RNA is useful as marker(s) for various diseases, conditions, and medical states as the expression of the RNAs change with disease severity/stage/outcome, age, etc. Commercial value resides in the ability to use a relatively small volume of sample from the subject (e.g., a drop of blood) to obtain significant clinical information. Moreover, additional value resides in the fact that multiple samples can be obtained from one or more subjects over a multitude of time intervals (e.g., samples obtained every minute, hour, day, week, month, etc.) such that those reviewing the results of the sequencing can gain a clearer resolution of the subject's medical state.
[0078] In some embodiments, the purified RNA from the biological sample is analyzed by Sequencing by Synthesis (SBS) techniques. SBS techniques generally involve the enzymatic extension of a nascent nucleic acid strand through the iterative addition of nucleotides against a template strand. In traditional methods of SBS, a single nucleotide monomer may be provided to a target nucleotide in the presence of a polymerase in each delivery. However, in some of the methods described herein, more than one type of nucleotide monomer can be provided to a target nucleic acid in the presence of a polymerase in a delivery.
[0079] SBS can utilize nucleotide monomers that have a terminator moiety or those that lack any terminator moieties. Methods utilizing nucleotide monomers lacking terminators include, for example, pyrosequencing and sequencing using γ-phosphate-labeled nucleotides. In methods using nucleotide monomers lacking terminators, the number of different nucleotides added in each cycle can be dependent upon the template sequence and the mode of nucleotide delivery. For SBS techniques that utilize nucleotide monomers having a terminator moiety, the terminator can be effectively irreversible under the sequencing conditions used, as is the case for traditional Sanger sequencing which utilizes dideoxynucleotides, or the terminator can be reversible, as is the case for sequencing methods developed by Solexa (now Illumina, Inc.). In preferred methods a terminator moiety can be reversibly terminating.
[0080] SBS techniques can utilize nucleotide monomers that have a label moiety or those that lack a label moiety. Accordingly, incorporation events can be detected based on a characteristic of the label, such as fluorescence of the label; a characteristic of the nucleotide monomer such as molecular weight or charge; a byproduct of incorporation of the nucleotide, such as release of pyrophosphate; or the like. In embodiments where two or more different nucleotides are present in a sequencing reagent, the different nucleotides can be distinguishable from each other, or alternatively, the two or more different labels can be indistinguishable under the detection techniques being used. For example, the different nucleotides present in a sequencing reagent can have different labels and they can be distinguished using appropriate optics as exemplified by the sequencing methods developed by Solexa (now Illumina, Inc.). However, it is also possible to use the same label for the two or more different nucleotides present in a sequencing reagent or to use detection optics that do not necessarily distinguish the different labels. Thus, in a doublet sequencing reagent having a mixture of A/C both the A and C can be labeled with the same fluorophore. Furthermore, when doublet delivery methods are used all of the different nucleotide monomers can have the same label or different labels can be used, for example, to distinguish one mixture of different nucleotide monomers from a second mixture of nucleotide monomers. For example, using the [First delivery nucleotide monomers]+[Second delivery nucleotide monomers] nomenclature set forth above and taking an example of A/C+G/T, the A and C monomers can have the same first label and the G and T monomers can have the same second label, wherein the first label is different from the second label. Alternatively, the first label can be the same as the second label and incorporation events of the first delivery can be distinguished from incorporation events of the second delivery based on the temporal separation of cycles in an SBS protocol. Accordingly, a low resolution sequence representation obtained from such mixtures will be degenerate for two pairs of nucleotides (T/G, which is complementary to A and C, respectively; and C/A which is complementary to G/T, respectively).
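To illustrate the degeneracy described above for an A/C+G/T doublet delivery, the following sketch collapses a template sequence into a two-symbol representation. The use of the IUPAC ambiguity codes K (G or T) and M (A or C) as output symbols, and the function name, are assumptions made for this sketch only.

```python
# Template bases T and G pair with the first delivery (A/C);
# template bases C and A pair with the second delivery (G/T).
# IUPAC ambiguity codes: K = G or T, M = A or C.
DEGENERATE_CODE = {"T": "K", "G": "K", "A": "M", "C": "M"}


def degenerate_representation(template: str) -> str:
    """Return the low-resolution form of a template under A/C+G/T delivery."""
    return "".join(DEGENERATE_CODE[base] for base in template.upper())


low_res = degenerate_representation("ATGC")
```

Two templates that differ only by a T/G or C/A substitution produce the same representation, which is the sense in which the representation is degenerate.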
[0081] Some embodiments include pyrosequencing techniques. Pyrosequencing detects the release of inorganic pyrophosphate (PPi) as particular nucleotides are incorporated into the nascent strand (Ronaghi, M., Karamohamed, S., Pettersson, B., Uhlen, M. and Nyren, P. (1996) "Real-time DNA sequencing using detection of pyrophosphate release." Analytical Biochemistry 242(1), 84-9; Ronaghi, M. (2001) "Pyrosequencing sheds light on DNA sequencing." Genome Res. 11(1), 3-11; Ronaghi, M., Uhlen, M. and Nyren, P. (1998) "A sequencing method based on real-time pyrophosphate." Science 281(5375), 363; U.S. Pat. No. 6,210,891; U.S. Pat. No. 6,258,568 and U.S. Pat. No. 6,274,320, the disclosures of which are incorporated herein by reference in their entireties). In pyrosequencing, released PPi can be detected by being immediately converted to adenosine triphosphate (ATP) by ATP sulfurylase, and the level of ATP generated is detected via luciferase-produced photons.
[0082] In another example type of SBS, cycle sequencing is accomplished by stepwise addition of reversible terminator nucleotides containing, for example, a cleavable or photobleachable dye label as described, for example, in U.S. Pat. No. 7,427,673, U.S. Pat. No. 7,414,116 and U.S. Pat. No. 7,057,026, the disclosures of which are incorporated herein by reference. This approach is being commercialized by Solexa (now Illumina Inc.), and is also described in WO 91/06678 and WO 07/123,744 (filed in the United States Patent and Trademark Office as U.S. Ser. No. 12/295,337), each of which is incorporated herein by reference in its entirety. The availability of fluorescently-labeled terminators in which both the termination can be reversed and the fluorescent label cleaved facilitates efficient cyclic reversible termination (CRT) sequencing. Polymerases can also be co-engineered to efficiently incorporate and extend from these modified nucleotides.
[0083] In other embodiments, Ion Semiconductor Sequencing is utilized to analyze the purified RNA from the sample. Ion Semiconductor Sequencing is a method of DNA sequencing based on the detection of hydrogen ions that are released during DNA amplification. This is a method of "sequencing by synthesis," during which a complementary strand is built based on the sequence of a template strand.
[0084] For example, a microwell containing a template DNA strand to be sequenced can be flooded with a single species of deoxyribonucleotide (dNTP). If the introduced dNTP is complementary to the leading template nucleotide it is incorporated into the growing complementary strand. This causes the release of a hydrogen ion that triggers a hypersensitive ion sensor, which indicates that a reaction has occurred. If homopolymer repeats are present in the template sequence multiple dNTP molecules will be incorporated in a single cycle. This leads to a corresponding number of released hydrogens and a proportionally higher electronic signal.
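The flow-by-flow behavior described above can be sketched as a simple idealized simulation, in which the signal recorded for each dNTP flow equals the number of bases incorporated in that flow. The flow order "TACG", the function names, and the noise-free integer signal are assumptions made for this sketch and do not describe any particular instrument.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def simulate_flows(template: str, flow_order: str = "TACG"):
    """Return a list of (dNTP, signal) pairs for an idealized flow run.

    The signal is the count of incorporations in that flow, so a homopolymer
    run in the template yields a proportionally higher value.
    """
    if set(flow_order) != set("ACGT"):
        raise ValueError("flow_order must contain A, C, G and T")
    signals = []
    pos = 0
    while pos < len(template):
        for dntp in flow_order:
            count = 0
            # Incorporate while the flooded dNTP pairs with the leading base.
            while pos < len(template) and COMPLEMENT[template[pos]] == dntp:
                count += 1
                pos += 1
            signals.append((dntp, count))
            if pos >= len(template):
                break
    return signals


def call_bases(signals) -> str:
    """Translate flow signals back into the synthesized (complementary) strand."""
    return "".join(dntp * count for dntp, count in signals)


flows = simulate_flows("AAGT")
synthesized = call_bases(flows)
```

In this example the two leading A bases of the template give a signal of 2 in the first T flow, reflecting the homopolymer behavior described above.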
[0085] This technology differs from other sequencing technologies in that no modified nucleotides or optics are used. Ion semiconductor sequencing may also be referred to as ion torrent sequencing, proton-mediated sequencing, silicon sequencing, or semiconductor sequencing. Ion semiconductor sequencing was developed by Ion Torrent Systems Inc. and may be performed using a bench top machine. Rusk, N. (2011). "Torrents of Sequence," Nat Meth 8(1): 44. Although it is not necessary to understand the mechanism of an invention, it is believed that hydrogen ion release occurs during nucleic acid amplification because of the formation of a covalent bond and the release of pyrophosphate and a charged hydrogen ion. Ion semiconductor sequencing exploits these facts by determining if a hydrogen ion is released upon providing a single species of dNTP to the reaction.
[0086] For example, microwells on a semiconductor chip that each contain one single-stranded template DNA molecule to be sequenced and one DNA polymerase can be sequentially flooded with unmodified A, C, G or T dNTP. Pennisi, E. (2010). "Semiconductors inspire new sequencing technologies" Science 327(5970): 1190; and Perkel, J., "Making contact with sequencing's fourth generation" Biotechniques (2011). The hydrogen ion that is released in the reaction changes the pH of the solution, which is detected by a hypersensitive ion sensor. The unattached dNTP molecules are washed out before the next cycle when a different dNTP species is introduced.
[0087] Beneath the layer of microwells is an ion sensitive layer, below which is a hypersensitive ISFET ion sensor. All layers are contained within a CMOS semiconductor chip, similar to that used in the electronics industry. Each released hydrogen ion triggers the ISFET ion sensor. The series of electrical pulses transmitted from the chip to a computer is translated into a DNA sequence, with no intermediate signal conversion required. Each chip contains an array of microwells with corresponding ISFET detectors. Because nucleotide incorporation events are measured directly by electronics, the use of labeled nucleotides and optical measurements is avoided.
[0088] An example of an Ion Semiconductor Sequencing technique suitable for use in the methods of the provided disclosure is Ion Torrent sequencing (U.S. Patent Application Numbers 2009/0026082, 2009/0127589, 2010/0035252, 2010/0137143, 2010/0188073, 2010/0197507, 2010/0282617, 2010/0300559, 2010/0300895, 2010/0301398, and 2010/0304982), the content of each of which is incorporated by reference herein in its entirety. In Ion Torrent sequencing, DNA is sheared into fragments of approximately 300-800 base pairs, and the fragments are blunt ended. Oligonucleotide adaptors are then ligated to the ends of the fragments. The adaptors serve as primers for amplification and sequencing of the fragments. The fragments can be attached to a surface and are attached at a resolution such that the fragments are individually resolvable. Addition of one or more nucleotides releases a proton (H+), which signal is detected and recorded in a sequencing instrument. The signal strength is proportional to the number of nucleotides incorporated. User guides describe in detail the Ion Torrent protocol(s) that are suitable for use in methods of the invention, such as Life Technologies' literature entitled "Ion Sequencing Kit for User Guide v. 2.0" for use with their sequencing platform the Personal Genome Machine™ (PGM).
[0089] In some embodiments, as a part of the sample preparation process, "barcodes" may be associated with each sample. In this process, short oligonucleotides are added to primers, where each different sample uses a different oligo in addition to a primer.
[0090] The term "library", as used herein refers to a library of genome/transcriptome-derived sequences. The library may also have sequences allowing amplification of the "library" by the polymerase chain reaction or other in vitro amplification methods well known to those skilled in the art. The library may also have sequences that are compatible with next-generation high throughput sequencers such as an ion semiconductor sequencing platform.
[0091] The term "non-invasive" as used herein refers to a method of obtaining a sample from a subject in which the subject experiences little discomfort in the sample extraction process and the process itself may require little to no anesthesia or analgesia. For example, a non-invasive methodology may be a finger-prick procedure in which a small lancet is used to pierce the finger skin of a subject to obtain a drop of blood. Similar non-invasive methodologies can be used to obtain the sample.
[0092] In certain embodiments, the primers and barcodes are ligated to each sample as part of the library generation process. Thus during the amplification process associated with generating the ion amplicon library, the primer and the short oligo are also amplified. As the association of the barcode is done as part of the library preparation process, it is possible to use more than one library, and thus more than one sample. Synthetic nucleic acid barcodes may be included as part of the primer, where a different synthetic nucleic acid barcode may be used for each library. In some embodiments, different libraries may be mixed as they are introduced to a flow cell, and the identity of each sample may be determined as part of the sequencing process. Sample separation methods can be used in conjunction with sample identifiers. For example, a chip could have 4 separate channels and use 4 different barcodes to allow the simultaneous running of 16 different samples.
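The combination of physical channels and barcodes described above (4 channels with 4 barcodes giving 16 samples) can be sketched as a lookup keyed on the channel and the barcode found at the start of a read. The barcode sequences, sample names, and function names below are hypothetical and are chosen only for illustration.

```python
from itertools import product
from typing import Dict, Optional, Sequence, Tuple


def build_sample_sheet(channels: Sequence[int],
                       barcodes: Sequence[str]) -> Dict[Tuple[int, str], str]:
    """Assign one sample name to every (channel, barcode) combination."""
    return {
        (channel, barcode): f"sample_{i + 1}"
        for i, (channel, barcode) in enumerate(product(channels, barcodes))
    }


def identify_sample(sheet: Dict[Tuple[int, str], str],
                    channel: int, read: str) -> Optional[str]:
    """Return the sample whose barcode prefixes the read in this channel."""
    for (sheet_channel, barcode), sample in sheet.items():
        if sheet_channel == channel and read.startswith(barcode):
            return sample
    return None  # no barcode matched


# Hypothetical 4-nucleotide barcodes, none a prefix of another.
sheet = build_sample_sheet(channels=[1, 2, 3, 4],
                           barcodes=["ACGT", "CGTA", "GTAC", "TACG"])
sample = identify_sample(sheet, channel=2, read="ACGTTTGACCA")
```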
[0093] As described in greater detail in the Examples section below, in some embodiments, after the RNA from the sample is sequenced, some embodiments provide methods of analyzing the data. For example, the analyzing steps of the methodology include steps such as processing the raw sequencing data/reads to remove information related to barcodes and adapters using technologies provided by Cutadapt and AlienTrimmer. Thereafter, the sequences can be aligned to a reference sequence using technologies such as STAR or TopHat and, after alignment, the data can be quantitated to generate numerical estimates of each gene's expression or "counts" provided by technologies like featureCounts or htseq-count. The principal issue in these matters is recognizing variance in these counts due to technical methods that may not represent biological significance.
[0094] In some conventional methodologies that may not employ biological samples of drop-sized volumes, the variance issue described above can be addressed with a generally simple expression cutoff (i.e., any gene expression detected with >n counts usually exhibits low variance among replicates). This conventional methodology may not be suitable for small volume blood-based samples because the process of drying the small volume of blood in or on the sample collection apparatus may impart a non-uniform effect across RNA transcripts of variable length, possibly due to the RNA's biochemical structure and stability. In other words, the sample collection, drying, and storing process can create data that is difficult to analyze and/or rely upon when making clinical determinations.
[0095] In order to address the aforementioned issue, the inventors have started to survey the stability of different RNAs in collected samples (e.g., dried blood spots) by sequencing technical replicates and calculating the coefficient of variance (CV) for each transcript. By calculating the CV, the inventors were able to gather information related to each RNA's stability during the drying process and determine each RNA's potential accuracy as a biomarker. The CV value for each RNA is then employed in a two-step filtering process.
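As a minimal sketch of the per-transcript calculation described above, the CV can be computed as the standard deviation of a transcript's counts across technical replicates divided by their mean. The use of the sample standard deviation, the treatment of a zero mean, and the transcript names and counts are assumptions made for this sketch.

```python
from statistics import mean, stdev
from typing import Dict, Sequence


def coefficient_of_variation(counts: Sequence[float]) -> float:
    """CV = standard deviation / mean across technical replicates."""
    average = mean(counts)
    if average == 0:
        return float("inf")  # assumption: undetected transcripts are unusable
    return stdev(counts) / average  # sample standard deviation (n - 1)


def cv_table(replicate_counts: Dict[str, Sequence[float]]) -> Dict[str, float]:
    """Compute the CV for every transcript in a {transcript: counts} mapping."""
    return {name: coefficient_of_variation(values)
            for name, values in replicate_counts.items()}


# Hypothetical counts from three technical replicates.
table = cv_table({
    "transcript_stable": [100, 100, 100],
    "transcript_variable": [8, 10, 12],
})
```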
[0096] The first step of this analytical process includes the creation of a database of the CV values for one or more of the RNAs obtained from the samples. The database includes the CV values obtained using non-dried blood spot RNA sequencing data to interrogate a dozen technical replicates of a control RNA sample. In some aspects, the control RNA sample can be from a known cell type, such as the HEK cell line. This interrogation process allows the investigators to filter out subject (e.g., human) RNA transcripts exhibiting high variance that is likely due only to technical reasons, because the replicates are technical in nature, rather than biological.
[0097] The second step of this filtering approach includes creating relatively specific CV databases for each sample type and methods of extraction, library preparation, sequencing methodology, etc. The information in these specific CV databases can be used to filter sample-/project-specific technical variance so that the best RNAs can be selected as markers for medical purposes. Moreover, some aspects of this two-step analytical methodology can be employed with other RNA sequencing-based methodologies.
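One way the two-step filtering described above might be sketched is to pass candidate transcripts first through a general CV database built from control RNA replicates and then through a CV database specific to the sample type and protocol. The CV cutoff of 0.2, the exclusion of transcripts absent from a database, and all names below are assumptions made for this sketch; the description does not specify them.

```python
from typing import Dict, Iterable, List


def two_step_filter(candidates: Iterable[str],
                    control_cv: Dict[str, float],
                    project_cv: Dict[str, float],
                    max_cv: float = 0.2) -> List[str]:
    """Keep transcripts with low technical variance in both CV databases.

    Transcripts missing from a database are treated as unreliable (assumption).
    """
    missing = float("inf")
    # Step 1: general database from control RNA technical replicates.
    step_one = [t for t in candidates if control_cv.get(t, missing) <= max_cv]
    # Step 2: database specific to sample type, extraction, library prep, etc.
    return [t for t in step_one if project_cv.get(t, missing) <= max_cv]


# Hypothetical CV values for illustration.
control_cv = {"rna_1": 0.05, "rna_2": 0.50, "rna_3": 0.10}
project_cv = {"rna_1": 0.08, "rna_2": 0.04, "rna_3": 0.45}

selected = two_step_filter(["rna_1", "rna_2", "rna_3", "rna_4"],
                           control_cv, project_cv)
```

Here only the transcript with a low CV in both databases is retained as a candidate marker.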
[0098] Embodiments of the invention provide a method of analyzing a sample from a subject. Some aspects include obtaining a sample contained on and/or within a sample collection apparatus. Thereafter, nucleic acids can be obtained from the sample and the nucleic acids (e.g., intracellular and/or extracellular RNA) can be processed and sequenced to obtain information about the biological state of the subject. The data obtained from the sequencing steps can be processed using a two-step algorithm to determine which nucleic acids can provide the most reliable information.
[0099] Relative to conventional technologies, some embodiments of the invention offer improvements. For example, some embodiments require as little as a single drop of blood contained/dried on a sample collection apparatus to gain valuable insight into the biological/medical state of the subject. As such, the requirement for obtaining one or more vials of blood can be removed as a barrier to obtaining accurate information about a subject. Moreover, subjects/patients can largely obtain these samples themselves. Although a medical professional does have the capability of obtaining these small volume samples, subjects without medical training can be instructed on how to obtain the samples. For example, one or more drops of blood can be obtained from a subject using known methodologies, such as a finger stick that is currently used to regularly obtain blood for blood-glucose testing. As such, embodiments of this invention provide simple sample collection opportunities for subjects. Further, subjects can obtain regular longitudinal samples because of the relatively non-invasive nature of some embodiments of the invention and the ease with which subjects can obtain the sample(s). In other words, subjects can provide multiple samples obtained over varying time periods to medical professionals. For example, a subject can take a single drop of blood each hour of a day (i.e., 24 samples) and allow that blood to dry on the sample collection apparatus and provide that apparatus for nucleic-acid extraction and processing for biological/medical state analysis. This can provide a medical professional with much greater resolution in terms of assessing the biological/medical state of the subject.
[0100] Some embodiments of the inventive methodology can be usedwith any specific applications that may require only a limitedvolume of sample. Some embodiments of the invention can be used inconjunction with testing subjects that may not be able to provide asample of significant volume. For example, some embodiments may beused in conjunction with the testing of neonates and otherembodiments may be used in conjunction with the testing of one ormore types of endurance athletes. In particular, embodiments of theinvention can be used in testing neonates because of the limitedsource of sample (e.g., blood) and the relatively non-invasivenature of the method. Moreover, embodiments of the invention can beused in testing athletes, such as endurance athletes, because therelatively small volume of sample necessary for use in conjunctionwith the method will not significantly impact the blood volume ofthe athlete. In addition, the relatively small volume of sample(e.g., blood) that is required by the method may further enablemultiple samplings of the respective subjects.
[0101] Some embodiments of the invention may also be employed in other contexts. For example, the methodology can potentially be used to assess the sex of a child in utero. Specifically, it is known that extracellular DNA of a fetus can cross the placenta and enter the circulation of the mother. In the event that the fetus is male, embodiments of the methodology can be used to detect one or more RNA transcripts associated with the Y chromosome. Other embodiments of the invention can be used in conjunction with any other applications that can accommodate relatively small volumes of sample and/or require multiple sample acquisition events.
[0102] In one aspect, described herein is an assay comprising: measuring, in a sample obtained from a subject, the level of at least one miRNA selected from the group consisting of: miR-10b-5p; miR-151b; miR-29b-2-5p; miR-329-3p; miR-6511a-5p; miR-5690; miR-516b-5p; miR208b-3p; miR106a-5p; miR-363-3p; miR-4526; miR-129-1-3p; miR-129-2-3p; miR-132-3p; miR-132-5p; miR127-3p; miR212-3p; miR-1224-5p; miR16-2-3p; miR-1294; miR-30a-3p; miR-132-5p; miR-212-3p; miR-212-5p; miR-145-5p; and miR-29a-5p; and determining that the subject is at increased risk of Parkinson's Disease developing or progressing if the level of an miRNA selected from the group consisting of: miR-151b; miR-5690; miR-516b-5p; miR208b-3p; miR106a-5p; miR-363-3p; miR-30a-3p; and miR-29a-5p is increased relative to a reference, and determining that the subject is at decreased risk of Parkinson's Disease developing or progressing if the level of the miRNA is not increased relative to a reference; determining that the subject is at increased risk of Parkinson's Disease developing or progressing if the level of an miRNA selected from the group consisting of: miR-10b-5p; miR-29b-2-5p; miR-329-3p; miR-6511a-5p; miR-4526; miR-129-1-3p; miR-129-2-3p; miR-132-3p; miR-132-5p; miR127-3p; miR212-3p; miR-1224-5p; miR16-2-3p; miR-1294; miR-132-5p; miR-212-3p; miR-212-5p; and miR-145-5p is decreased relative to a reference, and determining that the subject is at decreased risk of Parkinson's Disease developing or progressing if the level of the miRNA is not decreased relative to a reference; wherein increased risk of Parkinson's Disease developing or progressing comprises developing Parkinson's Disease at a younger age; death due to Parkinson's Disease at a younger age; development of dementia; development of dementia at an earlier age; or onset of motor symptoms at an earlier age when compared to other individuals with Parkinson's Disease who do not have such a level of the miRNA.
[0103] In one aspect, described herein is a method comprising: measuring, in a sample obtained from a subject, the level of at least one miRNA selected from the group consisting of: miR-10b-5p; miR-151b; miR-29b-2-5p; miR-329-3p; miR-6511a-5p; miR-5690; miR-516b-5p; miR208b-3p; miR106a-5p; miR-363-3p; miR-4526; miR-129-1-3p; miR-129-2-3p; miR-132-3p; miR-132-5p; miR127-3p; miR212-3p; miR-1224-5p; miR16-2-3p; miR-1294; miR-30a-3p; miR-132-5p; miR-212-3p; miR-212-5p; miR-145-5p; and miR-29a-5p; and determining that the subject is at increased risk of Parkinson's Disease developing or progressing if the level of an miRNA selected from the group consisting of: miR-151b; miR-5690; miR-516b-5p; miR208b-3p; miR106a-5p; miR-363-3p; miR-30a-3p; and miR-29a-5p is increased relative to a reference, and determining that the subject is at decreased risk of Parkinson's Disease developing or progressing if the level of the miRNA is not increased relative to a reference; determining that the subject is at increased risk of Parkinson's Disease developing or progressing if the level of an miRNA selected from the group consisting of: miR-10b-5p; miR-29b-2-5p; miR-329-3p; miR-6511a-5p; miR-4526; miR-129-1-3p; miR-129-2-3p; miR-132-3p; miR-132-5p; miR127-3p; miR212-3p; miR-1224-5p; miR16-2-3p; miR-1294; miR-132-5p; miR-212-3p; miR-212-5p; and miR-145-5p is decreased relative to a reference, and determining that the subject is at decreased risk of Parkinson's Disease developing or progressing if the level of the miRNA is not decreased relative to a reference; and administering a treatment for Parkinson's Disease if the subject is at increased risk of Parkinson's Disease developing or progressing; wherein increased risk of Parkinson's Disease developing or progressing comprises developing Parkinson's Disease at a younger age; death due to Parkinson's Disease at a younger age; development of dementia; development of dementia at an earlier age; or onset of motor symptoms at an earlier age when compared to other individuals with Parkinson's Disease who do not have such a level of the miRNA.
[0104] In some embodiments, a treatment for Parkinson's Disease can be selected from the group consisting of: Levodopa agonists; dopamine agonists; COMT inhibitors; deep brain stimulation; MAO-B inhibitors; lesional surgery; regular physical exercise; regular mental exercise; improvements to the diet; and Lee Silverman voice treatment. In some embodiments, a treatment for Parkinson's Disease can comprise administering an agent that modulates (e.g., increases or decreases) the abnormal level or expression of at least one of the said miRNAs.
[0105] In one aspect, described herein is an assay comprising: measuring, in a sample obtained from a subject, the level of at least one miRNA selected from the group consisting of: miR-10b-5p; miR196a-5p; miR196b-5p; miR615-3p; miR1247-5p; miR106a-5p; miR363-3p; miR-129-1-3p; and miR-132-3p; and determining that the subject is at increased likelihood of Huntington's Disease developing at an earlier age or progressing more rapidly if the level of an miRNA selected from the group consisting of: miR-10b-5p; miR196a-5p; miR196b-5p; miR615-3p; miR1247-5p; miR106a-5p; and miR363-3p is increased relative to a reference, and determining that the subject is at decreased likelihood of Huntington's Disease developing at an earlier age or progressing more rapidly if the level of the miRNA is not increased relative to a reference; or determining that the subject is at increased likelihood of Huntington's Disease developing at an earlier age or progressing more rapidly if the level of an miRNA selected from the group consisting of: miR-129-1-3p and miR-132-3p is decreased relative to a reference, and determining that the subject is at decreased likelihood of Huntington's Disease developing at an earlier age or progressing more rapidly if the level of the miRNA is not decreased relative to a reference; wherein increased likelihood of Huntington's Disease developing at an earlier age or progressing more rapidly comprises developing Huntington's Disease at a younger age; death due to Huntington's Disease at a younger age; and/or becoming more severely disabled at a younger age as compared to other individuals with Huntington's Disease who do not have such a level of the miRNA.
[0106] In one aspect, described herein is a method comprising: measuring, in a sample obtained from a subject, the level of at least one miRNA selected from the group consisting of: miR-10b-5p; miR196a-5p; miR196b-5p; miR615-3p; miR1247-5p; miR106a-5p; miR363-3p; miR-129-1-3p; and miR-132-3p; and determining that the subject is at increased likelihood of Huntington's Disease developing at an earlier age or progressing more rapidly if the level of an miRNA selected from the group consisting of: miR-10b-5p; miR196a-5p; miR196b-5p; miR615-3p; miR1247-5p; miR106a-5p; and miR363-3p is increased relative to a reference, and determining that the subject is at decreased likelihood of Huntington's Disease developing at an earlier age or progressing more rapidly if the level of the miRNA is not increased relative to a reference; or determining that the subject is at increased likelihood of Huntington's Disease developing at an earlier age or progressing more rapidly if the level of an miRNA selected from the group consisting of: miR-129-1-3p and miR-132-3p is decreased relative to a reference, and determining that the subject is at decreased likelihood of Huntington's Disease developing at an earlier age or progressing more rapidly if the level of the miRNA is not decreased relative to a reference; and administering a treatment for Huntington's Disease if the subject is at increased likelihood of Huntington's Disease developing at an earlier age or progressing more rapidly; wherein increased likelihood of Huntington's Disease developing at an earlier age or progressing more rapidly comprises developing Huntington's Disease at a younger age; death due to Huntington's Disease at a younger age; and/or becoming more severely disabled at a younger age, when compared to other individuals with Huntington's Disease who do not have such a level of the miRNA.
[0107] In some embodiments, a treatment for Huntington's Disease can be selected from the group consisting of: regular physical exercise; regular mental exercise; improvements to the diet; or administering creatine monohydrate, coenzyme Q10, or sodium phenylbutyrate. In some embodiments, a treatment for Huntington's Disease can comprise administering an agent that modulates (e.g., increases or decreases) the abnormal level or expression of at least one of the miRNAs whose abnormal levels and/or expression is described herein as indicating an increased risk or likelihood of Huntington's Disease developing or progressing.
[0108] Additional aspects of assaying specific miRNAs that indicate neurodegenerative disease are disclosed in U.S. patent application Ser. No. 14/595,783, which is hereby incorporated by reference.
[0109] The present invention offers several advantages over other methods and assays. Isolating RNA with the EXORNEASY® kit enriches for extracellular RNA biomarkers. These extracellular RNA biomarkers are rich in information and can be used for various applications such as determining fetal sex from a maternal dried blood sample. Moreover, using RNA sequencing instead of a microarray hybridization technology expands the flexibility and possible uses of the method: RNA sequencing does not require species- or transcript-specific probes, and it can detect novel transcripts, gene fusions, single nucleotide variants (SNVs), and indels. RNA sequencing is a digital technology and therefore has a broader dynamic range, whereas microarrays are limited by background at the low end and signal saturation at the high end. The advantages of RNA sequencing over microarray analysis are further explained in Zhao, S. et al. (2014) Comparison of RNA-Seq and Microarray in Transcriptome Profiling of Activated T Cells, PLoS ONE 9:e78644; and Hrdlickova, R. et al. (2016) RNA-Seq Methods for Transcriptome Analysis, WIREs RNA doi:10.1002/wrna.1364.
[0110] The gene names listed herein, including the miRNA names, are common names. NCBI Gene ID numbers and/or sequences for each of the genes given herein can be obtained by searching the "Gene" Database of the NCBI (available on the World Wide Web at http://www.ncbi.nlm.nih.gov/) using the common name as the query and selecting the first returned Homo sapiens gene. Alternatively, sequences for each of the miRNAs given herein can be obtained by searching miRBase (available on the World Wide Web at mirbase.org) using the common name as the query and selecting the first returned Homo sapiens miRNA.
[0111] The present invention is further illustrated by the following examples that should not be construed as limiting. The contents of all references, patents, and published patent applications cited throughout this application, as well as the Figures, are incorporated herein by reference in their entirety for all purposes.
EXAMPLES
Example 1. Dried Blood Spot Assay Development
[0112] Materials and Methods
[0113] Sample Collection
[0114] A first set of samples was collected as undried blood samples that were added directly to the first buffer listed below (i.e., wet blood drops/droplets). Another set of samples was obtained as a single drop of a subject's blood on a sample collection apparatus obtained from FORTIUSBIO®. Thereafter, the blood spot was allowed to dry on the FORTIUSBIO® sample collection apparatus. After drying, the dried blood spot was removed from the sample collection apparatus and processed as described below.
[0115] In addition, samples were obtained from a subject before, during, and after vigorous exercise (e.g., biking). In particular, a drop of blood was obtained from the subject at 5:30 AM and 9:30 AM (pre-exercise samples), and the subject started exercising at 10:00 AM. After initiation of exercise at 10:00 AM, samples were obtained at ten-minute intervals during the one-hour exercise session (a total of 6 samples). Then, post-exercise samples were obtained at noon, 1:00 PM, 2:00 PM, and 3:00 PM. All samples were obtained using a finger-puncture technique in which a single drop of blood from the subject's finger was applied to a sample collection apparatus (i.e., RNA collection paper from FORTIUSBIO®).
[0116] RNA Extraction
[0117] The investigators extracted RNA using two different commercially available kits: the QIAAMP® RNA Blood Kit from QIAGEN® (e.g., for the isolation of total RNA) and the EXORNEASY® Serum/Plasma Kit from QIAGEN® (e.g., for the isolation of intracellular and/or extracellular RNA). The investigators modified the protocols as described below. In particular, the investigators did not perform the initial step of lysing all non-erythrocyte cells.
[0118] The protocol used for the EXORNEASY® kit, as modified by the investigators, is included below:
[0119] Remove a portion of the sample that has been dried on the sample collection apparatus and mix with 0.9% saline. Allow the sample to spin in the saline solution for approximately one hour.
[0120] After spinning/mixing for an hour, the sample collection apparatus is removed and the process proceeds as follows.
[0121] Add a 1:1 volume of buffer XBP and sample. Immediately invert tube 5 times to mix.
[0122] Place sample/XBP mix onto exoEasy spin column. Spin for 1 minute at 500 g. Discard flow-through.
[0123] Add 10 mL XWP and spin at 3,000-5,000 g for 5 minutes. Discard flow-through and collection tube.
[0124] Transfer spin column to a new collection tube.
[0125] Add 700 ul QIAzol to the membrane. Spin at 3,000-5,000 g for 5 minutes to collect the lysate and transfer to a 2 mL tube.
[0126] Vortex the lysate briefly and incubate at room temperature for 5 minutes.
[0127] Add 90 ul chloroform to the lysate. Cap tube and shake for 15 seconds.
[0128] Incubate at room temperature for 3 minutes.
[0129] Centrifuge 15 minutes at 12,000 g at 4° C.
[0130] Transfer upper aqueous phase to a new collection tube. Avoid transfer of any interphase material.
[0131] Add 2:1 volume of 100% ethanol to sample. Mix thoroughly by pipetting up and down several times.
[0132] Pipette 700 ul sample, including precipitate, if formed, into an RNeasy MinElute spin column in a 2 mL collection tube. Close lid and centrifuge at 9,000×g for 15 seconds at room temperature. Discard flow-through.
[0133] Repeat the preceding step using the remainder of the sample. Discard flow-through.
[0134] Add 700 ul Buffer RWT to RNeasy MinElute spin column. Close lid and centrifuge at 9,000×g for 15 seconds. Discard flow-through.
[0135] Pipette 500 ul Buffer RPE onto RNeasy MinElute spin column. Close lid and centrifuge at 9,000×g for 15 seconds. Discard flow-through.
[0136] Pipette 500 ul Buffer RPE onto RNeasy MinElute spin column. Close lid and centrifuge at 9,000×g for 2 minutes. Discard the flow-through and collection tube.
[0137] Place spin column into new 2 mL collection tube. Open lid of spin column and centrifuge at full speed (16,000×g) for 5 minutes to dry the membrane. Discard the collection tube with flow-through.
[0138] Place RNeasy MinElute spin column in a new 1.5 mL collection tube. Add 15 ul RNase-free water directly to center of membrane. Close lid and let column stand for 1 minute. Centrifuge for 1 minute at full speed (16,000×g) to elute RNA. Repeat once more for a total volume of 30 ul. This final step has been optimized compared to the manufacturer's recommended elution process. As such, this step provides the investigators with improved elution.
[0139] The protocol used for the QIAAMP® kit, as modified by the investigators, is included below, together with an optional DNase treatment:
[0140] Add 10 ul β-mercaptoethanol (BME) per 1 mL Buffer RLT before beginning.
[0141] Add Buffer RLT (350 ul) to the sample (i.e., dried blood spot or DBS). Vortex or pipet to mix and allow to rotate at room temperature for one hour.
[0142] After spinning/mixing for an hour, the sample collection apparatus is removed and the process proceeds as follows.
[0143] Pipet lysate directly into a QIAshredder spin column in a 2 mL collection tube and centrifuge for 2 minutes at maximum speed (16,000×g) to homogenize. Discard the QIAshredder spin column and save the homogenized lysate.
[0144] Add 1 volume (350 ul) of 70% ethanol to the homogenized lysate and mix by pipetting. Do not centrifuge.
[0145] Pipet sample, including any precipitate which may have formed, into a new QIAAMP® spin column in a 2 mL collection tube. Centrifuge for 15 seconds at 9,000×g. Maximum loading volume is 700 ul. If the volume of the sample exceeds 700 ul, successively load aliquots onto the QIAAMP® spin column and centrifuge as above. Discard flow-through. (Optional on-column DNase digestion after this step. See below.)
[0146] Transfer the QIAAMP® spin column into a new 2 mL collection tube. Pipette 700 ul of Buffer RW1 into the spin column and centrifuge for 15 seconds at 9,000×g to wash. Discard flow-through.
[0147] Place QIAAMP® spin column in a new 2 mL collection tube. Pipette 500 ul of Buffer RPE into the spin column and centrifuge for 15 seconds at 9,000×g. Discard flow-through.
[0148] Carefully open QIAAMP® spin column and add 500 ul of Buffer RPE. Close cap and centrifuge at full speed (16,000×g) for 3 minutes.
[0149] Place QIAAMP® spin column in a new 2 mL collection tube and discard the old collection tube with the filtrate. Centrifuge at full speed (16,000×g) for 1 minute.
[0150] Transfer QIAAMP® spin column into a 1.5 mL microcentrifuge tube and pipet 15 ul of RNase-free water directly onto the QIAAMP® membrane. Incubate for 1 minute before centrifuging for 1 minute at 9,000×g to elute the RNA. Repeat once more for a total volume of 30 ul. This final step has been optimized compared to the manufacturer's recommended elution process. As such, this step provides the investigators with improved elution.
[0151] On-Column DNase
[0152] Add 350 ul Buffer RW1 to the QIAAMP® spin column. Close lid and centrifuge for 15 seconds at 9,000×g to wash the membrane. Discard the flow-through.
[0153] Add 10 ul DNase I Stock solution to 70 ul Buffer RDD. Mix by inverting the tube and centrifuge briefly to collect residual liquid from the top and sides of the tube.
[0154] Add the DNase I incubation mix (80 ul) directly to the QIAAMP® spin column membrane and place at room temperature for 15 minutes.
[0155] Add 350 ul Buffer RW1 to the QIAAMP® spin column. Close the lid and centrifuge for 15 seconds at 9,000×g. Discard flow-through. Continue with the first Buffer RPE wash step in the protocol.
[0156] The preceding protocols provide embodiments of the present invention. These protocols are capable of further modifications, and this application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains.
[0157] Determination of RNA Yield
[0158] Total RNA yield was quantified with the Quant-iT RiboGreen RNA reagent (Invitrogen) using the low-range assay (1-50 pg/µL) in a 200-µL total volume of working reagent in a 96-well format (Costar), read on a plate reader (BioTek Synergy HT). The linearity of this assay is maintained in the presence of common post-purification contaminants such as salts, ethanol, chloroform, detergents, proteins, and agarose (Jones L J, Yue S T, Cheung C Y, Singer V L. 1998. RNA quantitation by fluorescence-based solution assay: RiboGreen reagent characterization. Anal Biochem 265: 368-374). Individual samples were assayed in triplicate, and the mean of the three replicates was calculated.
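The per-sample statistics reported in Tables 1-4 (count, mean, standard deviation, CV, and total yield) can be reproduced from the triplicate well readings. The following is a minimal Python sketch, assuming the sample standard deviation (n - 1 denominator) and a total yield equal to the mean concentration multiplied by the elution volume. The function name `summarize_triplicate` is hypothetical and does not come from the plate-reader software. The inputs are the three [Concentration] readings for the AMP WBD 1 sample in Table 1.

```python
from statistics import mean, stdev


def summarize_triplicate(concentrations, elution_volume_ul):
    """Summarize replicate concentration readings for one sample.

    Assumptions (not stated in the source): sample standard deviation
    (n - 1 denominator) and total yield = mean concentration * volume.
    """
    avg = mean(concentrations)
    sd = stdev(concentrations)          # sample standard deviation
    cv_percent = 100.0 * sd / avg       # coefficient of variation, in percent
    total_yield = avg * elution_volume_ul
    return {
        "count": len(concentrations),
        "mean": avg,
        "std_dev": sd,
        "cv_percent": cv_percent,
        "total_yield": total_yield,
    }


# AMP WBD 1 readings from Table 1 (wells A1, B1, C1); 30 ul elution volume
summary = summarize_triplicate([1.476, 1.457, 1.519], 30)
print(summary)
```

With these inputs the mean and total yield agree with the tabulated 1.484 and 44.52. The CV comes out slightly above the tabulated 2.125, presumably because the well values printed in the table are rounded.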
[0159] In addition, RNA quality was assessed by capillary electrophoresis of the extracted RNA on an Agilent Bioanalyzer. As is customary, RNA quality is quantified as an RNA integrity number (RIN), which is calculated by an algorithmic assessment of the amounts of the various RNAs present within the extracted RNA. High-quality cellular RNA generally exhibits a RIN value approaching 10.
[0160] RNA Sequencing
[0161] A portion of the extracted nucleic acids was introduced into the TruSeq Small RNA Sample reagents, followed by 15 cycles of PCR to amplify the library. The investigators clustered a single read v3 flow cell and performed RNA deep sequencing on the HiSeq 2000 using the RNA isolated from the aliquots of sample.
[0162] Sequencing Data Analysis
[0163] Coefficient of variation (CV) values for the respective RNAs were calculated. By calculating the CV, the inventors were able to gather information related to each RNA's stability during the drying process and determine each RNA's potential accuracy as a biomarker. The CV value for each RNA is then employed in a two-step filtering process.
[0164] The first step of this analytical process includes the creation of a database of the CV values for one or more of the RNAs obtained from the samples. The database includes the CV values obtained using non-dried blood spot RNA sequencing data to interrogate a dozen technical replicates of a control RNA sample. In some aspects, the control RNA sample can be from a known cell type, such as the HEK cell line. Because the replicates are technical rather than biological in nature, this interrogation process allows the investigators to filter subject (e.g., human) RNA transcripts exhibiting high variance that is likely due only to technical reasons.
[0165] The second step of this filtering approach includes creating relatively specific CV databases for each sample type and for each method of extraction, library preparation, sequencing, etc. The information in these specific CV databases can be used to filter sample-/project-specific technical variance so that the best RNAs can be selected as markers for medical purposes. Moreover, some aspects of this two-step analytical methodology can be employed with other RNA sequencing-based methodologies.
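The CV calculation and database construction described above can be sketched as follows. This is an illustrative Python example, not the investigators' implementation: the function names (`coefficient_of_variation`, `build_cv_database`), the toy count values, and the use of the sample standard deviation are all assumptions.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """CV = sample standard deviation / mean; None if the mean is zero."""
    avg = mean(values)
    if avg == 0:
        return None  # CV is undefined for a transcript that is never detected
    return stdev(values) / avg


def build_cv_database(counts_by_transcript):
    """Map each transcript ID to the CV of its counts across technical replicates."""
    return {
        transcript_id: coefficient_of_variation(counts)
        for transcript_id, counts in counts_by_transcript.items()
    }


# Hypothetical counts for 12 technical replicates (made-up numbers)
counts = {
    "stable_transcript": [100, 104, 98, 101, 97, 103, 99, 102, 100, 96, 105, 95],
    "dropout_transcript": [0, 0, 0, 0, 0, 7, 0, 0, 0, 0, 0, 0],
}
cv_db = build_cv_database(counts)
for transcript_id, cv in cv_db.items():
    print(transcript_id, round(cv, 4))
```

Under this definition, a transcript detected in only one of n replicates has a CV of exactly the square root of n, which is about 3.464 for n = 12. That is the same value shown in the CV DBS column for the "poor" entries of Table 5, although the source does not state how many DBS replicates were used or how those values arose.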
[0166] Results
[0167] As an initial matter, after extraction from the samples, the quality of the RNA was assessed using an Agilent Bioanalyzer, as illustrated in FIGS. 5-14. First, in FIGS. 5 and 6 and Table 1, the investigators were able to demonstrate the relative quality and quantity of the RNA molecules isolated by the QIAAMP® and EXORNEASY® kits with the addition of only a drop of wet whole blood to the first step of the extraction process (i.e., rather than using a dried blood spot), as a control. The standard curve associated with Table 1 is shown in FIG. 1. The data show that the investigators were able to isolate RNAs of acceptable quality (FIGS. 5 and 6) and of a sufficient concentration (Table 1). Next, the inventors investigated the isolation of RNA using the QIAAMP® kit and dried blood spots that had been previously dried on a sample collection apparatus. The inventors performed the extractions with and without the DNA digestion step using DNase. The data show that the investigators were able to isolate RNAs of acceptable quality (FIGS. 7 and 8) and of a sufficient concentration (Table 2). The standard curve associated with Table 2 is shown in FIG. 2.
[0168] The inventors also compared the quality and quantity of the RNAs obtained using the QIAAMP® and EXORNEASY® kits, as shown in FIGS. 9 and 10 and Table 3. Using a dried blood spot isolated from a sample collection apparatus and processed using the above-described protocols, the investigators were able to isolate RNAs of acceptable quality (FIGS. 9 and 10) and of a sufficient concentration (Table 3). The standard curve associated with Table 3 is shown in FIG. 3.
[0169] Finally, as illustrated in FIGS. 11-14 and Table 4, the inventors also investigated the feasibility of collecting many samples from a subject who has been exercising. Using dried blood spots isolated from one or more sample collection apparatuses and processed using the above-described protocols, the investigators were able to isolate RNAs of acceptable quality (FIGS. 11-14) and of a sufficient concentration (Table 4). The standard curve associated with Table 4 is shown in FIG. 4.
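TABLE 1
Well ID / Name | Well | Conc/Dil | 485,528 | [Concentration] | Count | Mean | Std Dev | CV (%) | Total Volume (ul) | Total Yield |
---|---|---|---|---|---|---|---|---|---|---|
AMP WBD 1 | A1 | | 11112 | 1.476 | 3 | 1.484 | 0.032 | 2.125 | 30 | 44.52 |
 | B1 | | 10969 | 1.457 | | | | | | |
 | C1 | | 11426 | 1.519 | | | | | | |
AMP WBD 2 | A2 | | 13817 | 1.841 | 3 | 1.868 | 0.051 | 2.723 | 30 | 56.04 |
 | B2 | | 13777 | 1.836 | | | | | | |
 | C2 | | 14449 | 1.927 | | | | | | |
AMP WBD 3 | A3 | | 21277 | 2.848 | 3 | 2.803 | 0.098 | 3.509 | 30 | 84.09 |
 | B3 | | 20108 | 2.69 | | | | | | |
 | C3 | | 21447 | 2.871 | | | | | | |
AMP WBD 4 | A4 | | 11637 | 1.547 | 3 | 1.493 | 0.049 | 3.285 | 30 | 44.79 |
 | B4 | | 11134 | 1.479 | | | | | | |
 | C4 | | 10931 | 1.452 | | | | | | |
EXO WBD 1 | A5 | | 60185 | 8.098 | 3 | 7.847 | 0.235 | 3 | 30 | 235.41 |
 | B5 | | 56725 | 7.631 | | | | | | |
 | C5 | | 58067 | 7.812 | | | | | | |
EXO WBD 2 | A6 | | 49805 | 6.697 | 3 | 6.55 | 0.149 | 2.268 | 30 | 196.5 |
 | B6 | | 48730 | 6.552 | | | | | | |
 | C6 | | 47603 | 6.4 | | | | | | |
EXO WBD 3 | A7 | | 31340 | 4.206 | 3 | 4.23 | 0.039 | 0.933 | 30 | 126.9 |
 | B7 | | 31857 | 4.275 | | | | | | |
 | C7 | | 31362 | 4.209 | | | | | | |
EXO WBD 4 | A8 | | 63197 | 8.504 | 3 | 8.483 | 0.216 | 2.543 | 30 | 254.49 |
 | B8 | | 61366 | 8.257 | | | | | | |
 | C8 | | 64552 | 8.687 | | | | | | |
STD1 | H10 | 10 | 75678 | 10.188 | 3 | 10.113 | 0.072 | 0.715 | | |
 | H11 | 10 | 74608 | 10.044 | | | | | | |
 | H12 | 10 | 75081 | 10.107 | | | | | | |
STD2 | H7 | 5 | 35661 | 4.789 | 3 | 4.859 | 0.154 | 3.18 | | |
 | H8 | 5 | 35384 | 4.751 | | | | | | |
 | H9 | 5 | 37491 | 5.036 | | | | | | |
STD3 | H4 | 2.5 | 17947 | 2.399 | 3 | 2.461 | 0.059 | 2.391 | | |
 | H5 | 2.5 | 18462 | 2.468 | | | | | | |
 | H6 | 2.5 | 18814 | 2.516 | | | | | | |
STD4 | H1 | 1.25 | 3485 | 0.447 | 3 | 0.931 | 0.42 | 45.153 | | |
 | H2 | 1.25 | 8608 | 1.139 | | | | | | |
 | H3 | 1.25 | 9120 | 1.208 | | | | | | |
STD5 | G10 | 0.625 | 4911 | 0.64 | 3 | 0.696 | 0.06 | 8.655 | | |
 | G11 | 0.625 | 5274 | 0.689 | | | | | | |
 | G12 | 0.625 | 5799 | 0.76 | | | | | | |
STD6 | G7 | 0.3125 | 3323 | 0.425 | 3 | 0.36 | 0.057 | 15.764 | | |
 | G8 | 0.3125 | 2554 | 0.322 | | | | | | |
 | G9 | 0.3125 | 2642 | 0.334 | | | | | | |
STD7 | G4 | 0.15625 | 2147 | 0.267 | 3 | 0.247 | 0.019 | 7.727 | | |
 | G5 | 0.15625 | 1981 | 0.244 | | | | | | |
 | G6 | 0.15625 | 1866 | 0.229 | | | | | | |
STD8 | G1 | 0 | 1487 | 0.178 | 3 | 0.177 | 0.006 | 3.504 | | |
 | G2 | 0 | 1529 | 0.183 | | | | | | |
 | G3 | 0 | 1437 | 0.171 | | | | | | |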
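TABLE 2
Well ID / Name | Well | Conc/Dil | 485,528 | [Concentration] | Count | Mean | Std Dev | CV (%) | Total Volume (ul) | Total Yield |
---|---|---|---|---|---|---|---|---|---|---|
1+ AMP-DNase (with DNase) | A1 | | 3754 | 0.698 | 3 | 0.685 | 0.012 | 1.715 | 30 | 20.55 |
 | B1 | | 3578 | 0.674 | | | | | | |
 | C1 | | 3655 | 0.684 | | | | | | |
2+ AMP-DNase | A2 | | 3510 | 0.665 | 3 | 0.668 | 0.01 | 1.549 | 30 | 20.04 |
 | B2 | | 3617 | 0.679 | | | | | | |
 | C2 | | 3466 | 0.659 | | | | | | |
3+ AMP-DNase | A3 | | 2391 | 0.516 | 3 | 0.506 | 0.01 | 1.892 | 30 | 15.18 |
 | B3 | | 2301 | 0.504 | | | | | | |
 | C3 | | 2249 | 0.497 | | | | | | |
4+ AMP-DNase | A4 | | 2836 | 0.575 | 3 | 0.569 | 0.013 | 2.355 | 30 | 17.07 |
 | B4 | | 2853 | 0.578 | | | | | | |
 | C4 | | 2671 | 0.553 | | | | | | |
1- AMP-DNase (without DNase) | A5 | | 4560 | 0.805 | 3 | 0.795 | 0.009 | 1.15 | 30 | 23.85 |
 | B5 | | 4471 | 0.793 | | | | | | |
 | C5 | | 4425 | 0.787 | | | | | | |
2- AMP-DNase | A6 | | 6412 | 1.052 | 3 | 1.008 | 0.044 | 4.336 | 30 | 30.24 |
 | B6 | | 5756 | 0.964 | | | | | | |
 | C6 | | 6079 | 1.007 | | | | | | |
3- AMP-DNase | A7 | | 3870 | 0.713 | 3 | 0.695 | 0.016 | 2.234 | 30 | 20.85 |
 | B7 | | 3676 | 0.687 | | | | | | |
 | C7 | | 3661 | 0.685 | | | | | | |
4- AMP-DNase | A8 | | 4394 | 0.783 | 3 | 0.77 | 0.011 | 1.384 | 30 | 23.1 |
 | B8 | | 4252 | 0.764 | | | | | | |
 | C8 | | 4259 | 0.765 | | | | | | |
STD1 | H10 | 10 | 76090 | 10.333 | 2 | 9.914 | 0.592 | 5.971 | | |
 | H11 | 10 | 78987 | >10.500 | | | | | | |
 | H12 | 10 | 69805 | 9.496 | | | | | | |
STD2 | H7 | 5 | 34320 | 4.769 | 3 | 4.897 | 0.198 | 4.048 | | |
 | H8 | 5 | 34523 | 4.796 | | | | | | |
 | H9 | 5 | 36993 | 5.125 | | | | | | |
STD3 | H4 | 2.5 | 14887 | 2.18 | 3 | 2.179 | 0.076 | 3.506 | | |
 | H5 | 2.5 | 14301 | 2.102 | | | | | | |
 | H6 | 2.5 | 15448 | 2.255 | | | | | | |
STD4 | H1 | 1.25 | 1595 | 0.41 | 3 | 0.815 | 0.351 | 43.047 | | |
 | H2 | 1.25 | 6086 | 1.008 | | | | | | |
 | H3 | 1.25 | 6223 | 1.026 | | | | | | |
STD5 | G10 | 0.625 | 2658 | 0.552 | 3 | 0.561 | 0.015 | 2.69 | | |
 | G11 | 0.625 | 2668 | 0.553 | | | | | | |
 | G12 | 0.625 | 2859 | 0.578 | | | | | | |
STD6 | G7 | 0.3125 | 2483 | 0.528 | 3 | 0.445 | 0.073 | 16.452 | | |
 | G8 | 0.3125 | 1458 | 0.392 | | | | | | |
 | G9 | 0.3125 | 1628 | 0.414 | | | | | | |
STD7 | G4 | 0.15625 | 1463 | 0.392 | 3 | 0.384 | 0.014 | 3.591 | | |
 | G5 | 0.15625 | 1464 | 0.392 | | | | | | |
 | G6 | 0.15625 | 1284 | 0.369 | | | | | | |
STD8 | G1 | 0 | 1431 | 0.388 | 3 | 0.38 | 0.008 | 2.034 | | |
 | G2 | 0 | 1369 | 0.38 | | | | | | |
 | G3 | 0 | 1315 | 0.373 | | | | | | |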
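TABLE 3
Well ID / Name | Well | Conc/Dil | 485,528 | [Concentration] | Count | Mean | Std Dev | CV (%) | Total Volume (ul) | Total Yield (ng) |
---|---|---|---|---|---|---|---|---|---|---|
AMP 1 | A1 | | 3016 | 0.441 | 3 | 0.449 | 0.013 | 2.813 | 30 | 13.47 |
 | B1 | | 3028 | 0.442 | | | | | | |
 | C1 | | 3187 | 0.463 | | | | | | |
AMP 2 | A2 | | 3979 | 0.568 | 3 | 0.603 | 0.035 | 5.725 | 30 | 18.09 |
 | B2 | | 4249 | 0.604 | | | | | | |
 | C2 | | 4501 | 0.637 | | | | | | |
AMP 3 | A3 | | 3478 | 0.502 | 3 | 0.515 | 0.016 | 3.136 | 30 | 15.45 |
 | B3 | | 3545 | 0.511 | | | | | | |
 | C3 | | 3715 | 0.533 | | | | | | |
AMP 4 | A4 | | 3400 | 0.491 | 3 | 0.501 | 0.032 | 6.444 | 30 | 15.03 |
 | B4 | | 3267 | 0.474 | | | | | | |
 | C4 | | 3740 | 0.536 | | | | | | |
EXO 1 | A5 | | 2094 | 0.319 | 3 | 0.326 | 0.011 | 3.308 | 30 | 9.78 |
 | B5 | | 2111 | 0.321 | | | | | | |
 | C5 | | 2243 | 0.338 | | | | | | |
EXO 2 | A6 | | 2288 | 0.344 | 3 | 0.342 | 0.005 | 1.392 | 30 | 10.26 |
 | B6 | | 2230 | 0.337 | | | | | | |
 | C6 | | 2296 | 0.345 | | | | | | |
EXO 3 | A7 | | 2626 | 0.389 | 3 | 0.394 | 0.018 | 4.655 | 30 | 11.82 |
 | B7 | | 2551 | 0.379 | | | | | | |
 | C7 | | 2820 | 0.415 | | | | | | |
EXO 4 | A8 | | 3300 | 0.478 | 3 | 0.482 | 0.008 | 1.703 | 30 | 14.46 |
 | B8 | | 3396 | 0.491 | | | | | | |
 | C8 | | 3280 | 0.476 | | | | | | |
STD1 | H10 | 10 | 72917 | 9.684 | 3 | 9.896 | 0.252 | 2.551 | | |
 | H11 | 10 | 74010 | 9.828 | | | | | | |
 | H12 | 10 | 76632 | 10.175 | | | | | | |
STD2 | H7 | 5 | 38571 | 5.142 | 3 | 5.281 | 0.225 | 4.255 | | |
 | H8 | 5 | 38707 | 5.16 | | | | | | |
 | H9 | 5 | 41580 | 5.54 | | | | | | |
STD3 | H4 | 2.5 | 18004 | 2.423 | 3 | 2.46 | 0.167 | 6.791 | | |
 | H5 | 2.5 | 17188 | 2.315 | | | | | | |
 | H6 | 2.5 | 19667 | 2.642 | | | | | | |
STD4 | H1 | 1.25 | 10277 | 1.401 | 3 | 1.176 | 0.194 | 16.53 | | |
 | H2 | 1.25 | 7785 | 1.071 | | | | | | |
 | H3 | 1.25 | 7678 | 1.057 | | | | | | |
STD5 | G10 | 0.625 | 2564 | 0.381 | 3 | 0.386 | 0.054 | 13.929 | | |
 | G11 | 0.625 | 2212 | 0.334 | | | | | | |
 | G12 | 0.625 | 3022 | 0.441 | | | | | | |
STD6 | G7 | 0.3125 | 1319 | 0.216 | 3 | 0.221 | 0.019 | 8.383 | | |
 | G8 | 0.3125 | 1511 | 0.242 | | | | | | |
 | G9 | 0.3125 | 1238 | 0.206 | | | | | | |
STD7 | G4 | 0.15625 | 1361 | 0.222 | 3 | 0.216 | 0.012 | 5.405 | | |
 | G5 | 0.15625 | 1214 | 0.202 | | | | | | |
 | G6 | 0.15625 | 1372 | 0.223 | | | | | | |
STD8 | G1 | 0 | 1262 | 0.209 | 3 | 0.208 | 0.007 | 3.557 | | |
 | G2 | 0 | 1314 | 0.216 | | | | | | |
 | G3 | 0 | 1202 | 0.201 | | | | | | |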
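TABLE 4
Well ID / Name | Well | Conc/Dil | 485,528 | [Concentration] | Count | Mean | Std Dev | CV (%) | Total Volume (ul) | Total Yield |
---|---|---|---|---|---|---|---|---|---|---|
530 PRE | A1 | | 4315 | 0.432 | 3 | 0.435 | 0.004 | 0.957 | 30 | 13.05 |
 | B1 | | 4372 | 0.44 | | | | | | |
 | C1 | | 4324 | 0.433 | | | | | | |
930 PRE | A2 | | 6140 | 0.68 | 3 | 0.677 | 0.002 | 0.364 | 30 | 20.31 |
 | B2 | | 6105 | 0.675 | | | | | | |
 | C2 | | 6114 | 0.676 | | | | | | |
10M | A3 | | 4798 | 0.497 | 3 | 0.505 | 0.009 | 1.805 | 30 | 15.15 |
 | B3 | | 4930 | 0.515 | | | | | | |
 | C3 | | 4842 | 0.503 | | | | | | |
20M | A4 | | 4185 | 0.414 | 3 | 0.423 | 0.017 | 3.976 | 30 | 12.69 |
 | B4 | | 4174 | 0.413 | | | | | | |
 | C4 | | 4394 | 0.443 | | | | | | |
30M | A5 | | 4985 | 0.523 | 3 | 0.523 | 0.008 | 1.506 | 30 | 15.69 |
 | B5 | | 5043 | 0.531 | | | | | | |
 | C5 | | 4927 | 0.515 | | | | | | |
40M | A6 | | 5257 | 0.56 | 3 | 0.52 | 0.035 | 6.66 | 30 | 15.6 |
 | B6 | | 4797 | 0.497 | | | | | | |
 | C6 | | 4836 | 0.503 | | | | | | |
50M | A7 | | 3975 | 0.386 | 3 | 0.379 | 0.01 | 2.653 | 30 | 11.37 |
 | B7 | | 3838 | 0.367 | | | | | | |
 | C7 | | 3955 | 0.383 | | | | | | |
60M | A8 | | 4235 | 0.421 | 3 | 0.386 | 0.043 | 11.189 | 30 | 11.58 |
 | B8 | | 4072 | 0.399 | | | | | | |
 | C8 | | 3621 | 0.338 | | | | | | |
1 HR POST | A9 | | 4630 | 0.475 | 3 | 0.47 | 0.023 | 4.84 | 30 | 14.1 |
 | B9 | | 4411 | 0.445 | | | | | | |
 | C9 | | 4740 | 0.49 | | | | | | |
2 HR POST | A10 | | 4045 | 0.395 | 3 | 0.379 | 0.016 | 4.103 | 30 | 11.37 |
 | B10 | | 3906 | 0.376 | | | | | | |
 | C10 | | 3818 | 0.364 | | | | | | |
3 HR POST | A11 | | 3707 | 0.349 | 3 | 0.381 | 0.028 | 7.362 | 30 | 11.43 |
 | B11 | | 4095 | 0.402 | | | | | | |
 | C11 | | 4025 | 0.393 | | | | | | |
4 HR POST | A12 | | 4221 | 0.419 | 3 | 0.414 | 0.01 | 2.481 | 30 | 12.42 |
 | B12 | | 4098 | 0.402 | | | | | | |
 | C12 | | 4236 | 0.421 | | | | | | |
STD1 | H10 | 10 | 73297 | 9.796 | 3 | 9.932 | 0.251 | 2.526 | | |
 | H11 | 10 | 73170 | 9.779 | | | | | | |
 | H12 | 10 | 76433 | 10.222 | | | | | | |
STD2 | H7 | 5 | 38293 | 5.044 | 3 | 5.031 | 0.086 | 1.711 | | |
 | H8 | 5 | 38777 | 5.11 | | | | | | |
 | H9 | 5 | 37520 | 4.939 | | | | | | |
STD3 | H4 | 2.5 | 21911 | 2.821 | 3 | 2.659 | 0.182 | 6.842 | | |
 | H5 | 2.5 | 20973 | 2.693 | | | | | | |
 | H6 | 2.5 | 19268 | 2.462 | | | | | | |
STD4 | H1 | 1.25 | 11571 | 1.417 | 3 | 1.425 | 0.015 | 1.061 | | |
 | H2 | 1.25 | 11558 | 1.415 | | | | | | |
 | H3 | 1.25 | 11757 | 1.442 | | | | | | |
STD5 | G10 | 0.625 | 4825 | 0.501 | 3 | 0.537 | 0.037 | 6.824 | | |
 | G11 | 0.625 | 5084 | 0.536 | | | | | | |
 | G12 | 0.625 | 5365 | 0.574 | | | | | | |
STD6 | G7 | 0.3125 | 2433 | 0.176 | 3 | 0.215 | 0.035 | 16.155 | | |
 | G8 | 0.3125 | 2798 | 0.226 | | | | | | |
 | G9 | 0.3125 | 2927 | 0.243 | | | | | | |
STD7 | G4 | 0.15625 | 1809 | 0.092 | 3 | 0.092 | 0.001 | 0.592 | | |
 | G5 | 0.15625 | 1813 | 0.092 | | | | | | |
 | G6 | 0.15625 | 1805 | 0.091 | | | | | | |
STD8 | G1 | 0 | 788 | <0.000 | 0 | -- | -- | -- | | |
 | G2 | 0 | 809 | <0.000 | | | | | | |
 | G3 | 0 | 752 | <0.000 | | | | | | |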
Example 2. Novel Informatics Approach for the Analysis of Dried Blood Spot RNA-Seq
[0170] Initial steps in the analysis of dried blood spot (DBS) RNA-seq are similar to standard RNA-seq analysis pipelines. The raw sequencing reads must be trimmed of adapters (Cutadapt, AlienTrimmer), aligned to a reference (STAR, TopHat), and quantitated (featureCounts, htseq-count) to generate numerical estimates of each gene's expression, or "counts". The principal issue is recognizing variance in these counts that is due to technical reasons and does not represent biological significance.
[0171] In standard (non-DBS) RNA-seq experiments, this is addressed with a simple expression cutoff (i.e., any gene detected with > n counts usually exhibits low variance among replicates). This is not suitable for DBS samples, as the process of drying imparts a non-uniform effect across transcripts of variable length, presumably due to their biochemical structure and stability. To summarize, the process of drying RNA results in "messy" or "noisy" data.
[0172] To control for this, the stability of different dried transcripts is surveyed by sequencing technical replicates and calculating the coefficient of variance for each transcript on different collection media (i.e., the "CV DBS"). This gives us an idea of each transcript's stability during drying and potential accuracy as a biomarker when collected on a particular paper. The investigators utilize this data in a two-step filtering approach.
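A per-transcript coefficient of variance over technical replicates can be computed as the standard deviation divided by the mean. The Python sketch below assumes the sample standard deviation (n - 1 denominator) and returns NaN for transcripts with a zero mean; neither choice is specified in the source, and the function name is illustrative.

```python
import math
import statistics

def coefficient_of_variation(replicate_counts):
    """Return stdev / mean for one transcript's replicate counts.

    Assumptions: sample standard deviation (n - 1), NaN when the mean is zero.
    """
    mean = statistics.mean(replicate_counts)
    if mean == 0:
        return math.nan
    return statistics.stdev(replicate_counts) / mean

# Identical replicates have no technical variance
cv_stable = coefficient_of_variation([10, 10, 10])

# One nonzero count among twelve replicates gives sqrt(12)
cv_dropout = coefficient_of_variation([0] * 11 + [5])
```

Under these assumptions, a transcript detected in only one of twelve replicates yields a CV of sqrt(12), approximately 3.464101615, which matches the "CV DBS" value listed for the transcripts marked "poor" in Table 5. Whether that is how those values arose is an inference, not something the source states.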
[0173] First, the investigators have created a database of transcript coefficients of variance by using standard (non-DBS) RNA-seq to survey a dozen technical replicates of control HEK RNA (i.e., a database of "CV Standard" values). Because they are technical replicates (biologically identical), this information allows us to filter human transcripts exhibiting high variance due solely to technical reasons.
[0174] Secondly, the investigators have created highly specific coefficient of variance databases for each sample type and preparation method (i.e., databases of "CV DBS" values). This information can be used to filter project-specific technical variance and identify good transcriptional biomarkers. Representative biomarkers with good potential (i.e., "CV DBS" values and "CV Standard" values that are relatively low) and with poor potential (i.e., "CV DBS" values and/or "CV Standard" values that are relatively high) are shown in Table 5.
[0175] This two-step CV filtering approach represents a novel and necessary step in the analysis of DBS RNA-seq data. Furthermore, as the investigators have observed large amounts of variance in several highly detected transcripts from standard RNA-seq, our CV filter approach may also be useful for typical RNA-seq sample types and techniques.
TABLE 5

| EnsemblID | Gene | CV DBS | CV Standard | Biomarker Potential |
|---|---|---|---|---|
| ENSG00000183508 | FAM46C | 0.060717822 | 0.087871342 | good |
| ENSG00000114166 | KAT2B | 0.081479834 | 0.085560905 | good |
| ENSG00000122026 | RPL21 | 0.081957896 | 0.07062178 | good |
| ENSG00000136732 | GYPC | 0.082907156 | 0.144888107 | good |
| ENSG00000140264 | SERF2 | 0.092065358 | 0.147809395 | good |
| ENSG00000006468 | ETV1 | 3.464101615 | 0.09928696 | poor |
| ENSG00000125997 | BPIFB9P | 3.464101615 | 0.466427796 | poor |
| ENSG00000264573 | RN7SL15P | 3.464101615 | 0.334895213 | poor |
| ENSG00000269959 | SPACA6P-AS | 3.464101615 | 0.170486395 | poor |
| ENSG00000059573 | ALDH18A1 | 3.464101615 | 0.010980315 | poor |
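The two-step filter described in this example can be illustrated with the Table 5 values. The Python sketch below first removes transcripts with a high "CV Standard" and then removes transcripts with a high "CV DBS". The 0.2 thresholds and the function name are hypothetical, chosen only so the example runs; the source describes the values as "relatively low" or "relatively high" without giving numeric cutoffs.

```python
# (gene, CV DBS, CV Standard) copied from Table 5
TABLE5_ROWS = [
    ("FAM46C", 0.060717822, 0.087871342),
    ("KAT2B", 0.081479834, 0.085560905),
    ("RPL21", 0.081957896, 0.07062178),
    ("GYPC", 0.082907156, 0.144888107),
    ("SERF2", 0.092065358, 0.147809395),
    ("ETV1", 3.464101615, 0.09928696),
    ("BPIFB9P", 3.464101615, 0.466427796),
    ("RN7SL15P", 3.464101615, 0.334895213),
    ("SPACA6P-AS", 3.464101615, 0.170486395),
    ("ALDH18A1", 3.464101615, 0.010980315),
]

def two_step_cv_filter(rows, max_cv_standard, max_cv_dbs):
    """Apply the CV Standard filter, then the CV DBS filter; return gene names."""
    # Step 1: drop transcripts that are noisy even in standard RNA-seq
    after_standard = [row for row in rows if row[2] <= max_cv_standard]
    # Step 2: drop transcripts that are noisy for this sample type and preparation
    after_dbs = [row for row in after_standard if row[1] <= max_cv_dbs]
    return [gene for gene, _cv_dbs, _cv_standard in after_dbs]

# Hypothetical thresholds, not taken from the source
retained_genes = two_step_cv_filter(TABLE5_ROWS, max_cv_standard=0.2, max_cv_dbs=0.2)
```

With these illustrative thresholds, the retained genes coincide with the five transcripts labeled "good" in Table 5. ALDH18A1 shows why the second step matters: its "CV Standard" is very low, yet its "CV DBS" is high.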
Example 3. Evaluation of Performance of Several Sample Collection Apparatuses
[0176] Several sample collection apparatuses were evaluated for their efficiency in stabilizing RNA in dried blood samples. FORTIUSBIO® RNASOUND™ blood sampling cards contain a proprietary solution that lyses cells and releases RNA that is stabilized on the card for at least one week at room temperature. WHATMAN® FTA® non-indicating Elute Micro blood cards contain a lysis buffer consisting of EDTA, Tris, sodium dodecyl sulfate (SDS), and uric acid to lyse cells and stabilize DNA in the sample. WHATMAN® 903 Protein Saver cards are an untreated cellulose paper for blood sampling. Multiple, equivalent dried blood samples were collected with each of the three cards and the RNA in the samples was analyzed as described in Example 1.
[0177] Of the three cards tested, the WHATMAN® 903 Protein Saver cards out-performed the other cards in RNA recovery, gene expression profiling, and reproducibility. Surprisingly, the card without any added material designed to stabilize nucleic acid (i.e., untreated cellulose paper) performed the best.
Example 4. Identification of Biomarkers of Aerobic Exercise
[0178] Potential biomarkers of aerobic exercise (cycling) were identified with the DBS technology disclosed herein. Samples were collected at 5 am and 9 am (pre-exercise), at ten-minute intervals during exercise, and hourly post-exercise. DYSF (dysferlin; Ensembl:ENSG00000135636; NCBI Gene ID: 8291) and MMP9 (matrix metallopeptidase 9; Ensembl:ENSG00000100985; NCBI Gene ID: 4318) exhibited increased expression after an hour of cycling, peaked one hour post-exercise, and gradually decreased afterwards (see FIG. 16).
Example 5. Fetal Sex Determination with DBS Analysis of Cell-Free RNA in Maternal Plasma
[0179] Fetal sex may be determined with the DBS methods disclosed herein. In the plot presented in FIG. 17, cell-free RNA in maternal plasma was analyzed for the expression of biomarkers specific to the X chromosome or to the Y chromosome. The data in the plot demonstrate a clear differentiation between male and female fetuses when using cell-free RNA-seq of maternal plasma. Samples with high counts of RNA specific to the Y chromosome identify male fetuses. The DBS methods disclosed herein offer a simpler, more cost-effective, and safer means of determining fetal sex than other assays currently available.
Example 6. Identification of Biomarkers Correlating with Onset of Migraine
[0180] Two different time series of dried blood sample collection and analysis were conducted with a human subject where samples were drawn before, during, and after onset of a migraine. Three biomarkers were identified with the DBS method that correlate with the onset of migraine. Each exhibits a different expression pattern which may be indicative of roles in transcriptional pathways (see FIG. 18). The biomarkers are:
[0181] ABCC1 (ATP binding cassette subfamily C member 1; Ensembl:ENSG00000103222; NCBI Gene ID: 4363);
[0182] STXBP3 (syntaxin binding protein 3; Ensembl:ENSG00000116266; NCBI Gene ID: 6814); and
[0183] ZDHHC7 (zinc finger DHHC-type containing 7; Ensembl:ENSG00000153786; NCBI Gene ID: 55625).
[0184] Unless defined otherwise, all technical and scientific terms herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although any methods and materials, similar or equivalent to those described herein, can be used in the practice or testing of the present invention, the preferred methods and materials are described herein. All publications, patents, and patent publications cited are incorporated by reference herein in their entirety for all purposes.
[0185] The publications discussed herein are provided solely for their disclosure prior to the filing date of the present application. Nothing herein is to be construed as an admission that the present invention is not entitled to antedate such publication by virtue of prior invention.
[0186] It should be understood from the foregoing that, while particular embodiments have been illustrated and described, various modifications can be made thereto without departing from the spirit and scope of the invention as will be apparent to those skilled in the art. Such changes and modifications are within the scope and teachings of this invention as defined in the claims appended hereto.
* * * * *