Troubleshooting Your Data

1. IRIC Genomic Platform, Institute of Research in Immunology and Cancer, University of Montreal (www.genomique.iric.ca)

This document was taken from the Roswell Park Institute website (http://www.roswellpark.org).

The two most common causes of failure to get good (or any) sequence data for your samples are the purity and the concentration of your template DNA. If it appears that you have done everything correctly, then look below for some additional reasons why you might obtain less than optimal DNA sequence data quality. We've listed various causes and solutions and, for some, pictorial representations of what these specific problems might look like. Many causes and solutions may look rather obvious and just involve common sense, but you'd be surprised how many times we've heard "How could I have done that?"

1 No sequence data

[Figure: example chromatogram]

Cause: not enough or no DNA or primer in tube
Solutions: double-check your quantitations, stock concentrations and dilutions. While our sequencers are very sensitive and can detect a range of DNA concentrations, there is still a threshold amount that must be reached to obtain any sequence data.

Cause: inhibitory contaminant
Solutions: the cycle sequencing reaction used to amplify samples for automated sequencing is very sensitive to the
2. chloroform to remove proteins and other cellular contaminants from cell lysates. Phenol cannot be tolerated in the cycle sequencing reaction, as it denatures proteins and will thus inactivate the Taq polymerase enzyme used in the reaction. Chloroform does not have the strong denaturing properties of phenol and doesn't appear to adversely affect the sequencing reaction.

EDTA: EDTA can chelate the magnesium required by the Taq polymerase in the cycle sequencing reaction, so when submitting samples it is best to always have them diluted or resuspended in sterile ddH2O or 1X Tris buffer. Suspension in TE buffer is not recommended, though people have done it and many times there is not a problem. However, providing template DNA in water is an easy thing to do, and if there is a problem with your sequence quality, the absence of EDTA from your sample means it is one potential problem we can eliminate right away.

Finally, always check your chromatogram to be sure that the base caller called true bases and not background peaks.

This document was taken from the Roswell Park Institute website (http://www.roswellpark.org).
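The rule-of-thumb Tm formula quoted in this document, Tm = 2 °C × (A + T) + 4 °C × (G + C), together with its primer guidance (annealing at 50 °C, a suggested Tm of 52 °C to 58 °C, poor performance above 65 °C), can be sketched in Python. This is only an illustration: the function names and the "borderline" label are assumptions made here, and the document says "much lower than 50 °C" without giving a hard cutoff, so the exact boundaries are a judgment call.

```python
def wallace_tm(primer: str) -> int:
    """Rough Tm in deg C from base counts: 2 * (A + T) + 4 * (G + C).

    Ignores salt and formamide concentrations, as the document notes.
    """
    seq = primer.upper()
    unexpected = set(seq) - set("ACGT")
    if not seq or unexpected:
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def assess_primer_tm(tm: float) -> str:
    """Classify a Tm against the ranges quoted in the document.

    The hard cutoffs are assumptions for illustration only.
    """
    if tm < 50:          # below the 50 deg C annealing step
        return "low"
    if tm > 65:          # described as often working poorly
        return "high"
    if 52 <= tm <= 58:   # the facility's suggested range
        return "suggested"
    return "borderline"  # hypothetical label for 50-52 and 58-65


if __name__ == "__main__":
    primer = "ACGTACGTACGTACGTAC"  # made-up 18-mer, not a real primer
    tm = wallace_tm(primer)
    print(primer, tm, assess_primer_tm(tm))
```

For a real design decision, a calculator that accounts for salt concentration will give a better estimate; this sketch only shows how the quoted formula and ranges fit together.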
3. as a base identification when there are two or more peaks present at one position. This N may signify the legitimate occurrence of two nucleotides, as in the case of a heterozygote, but may also be seen when background noise is high or when multiple products are present.

When your sample exhibits weak signal, the software attempts to compensate by boosting the signal of sample bands to detectable levels. However, the background noise is also artificially amplified, giving a poor signal-to-noise ratio. Background noise appears as many smaller, undefined peaks under your sequence peaks of interest. This noise is always present, but with well-prepared samples of good signal strength it will be undetectable.

To determine whether your noisy data may be due to weak signal, look at your ABI trace file. If you are looking at a paper chromatogram, look towards the top and middle of your trace for a line that says "Signal". If the file is on your computer, click the "A" radio button in the bottom left-hand corner, which is visible when you have opened the trace file in a viewing program such as EditView or Chromas. Scroll down to the line that says "Signal" and you will see the four nucleotides followed by numbers in parentheses. These numbers represent the average signal strength of each nucleotide, and their values should optimally be between 200 and 400. If they are much less than 100, then you can assume your noisy data is at least partially due to its weak
4. efficiently, but can still anneal and extend and give rise to less intense fragments that can be seen underneath your peaks of interest. In both cases it's necessary to screen both your vector and insert carefully to look for sequences that may match or be similar to your proposed primer. You may need to choose another vector primer on the same end of the multiple cloning site or redesign your custom primer. When choosing another primer is difficult, such as when primer walking through a repetitive area, try to find a primer that has a 3' base match specific to your area of interest, which can help act as an anchor.

Cause: multiple priming sites in PCR
Solution: this may occur when one or both of the PCR primers hybridize to more than one position on the template DNA, giving rise to multiple PCR products. Often this will be obvious when visualizing the PCR products on an agarose gel, as there will be more than one band present. In this case, gel purification of the desired product will be necessary. One can run into difficulty, however, when the products are very similar in size (which may arise when amplifying related or repetitive DNA) and do not separate well on the gel. In this case, it may be necessary to optimize the PCR reaction or to redesign the PCR primers to choose a more specific priming site.

Cause: PCR primers acting as both forward and reverse
Solution: sometimes a PCR product may be generated when one primer functions as both
5. primer is quite long, both factors that can increase the potential for primer secondary structure formation. If possible, choose another primer with a lower Tm.

Cause: primers with n-1 population

[Figure: example chromatogram]

Solution: this problem is not uncommon and can result from poor-quality synthesis of sequencing primers. Primers are synthesized from the 3' end to the 5' end, and when synthesis is inefficient there can be a significant population of less-than-full-length primers: n-1s, which are full-length primers minus one base, plus other shorter derivatives. These primers have a common 3' end but different 5' ends; thus chains that terminate at the same position will have different lengths and will run at different positions on the gel. Primers that have degraded from the 3' end will also give this appearance. It is easy to spot this problem within the sequencing chromatogram, as each position will contain the true peak as well as the peak immediately to the right of it, giving the appearance of shadow peaks. Whatever the cause of the n-1s, it will be necessary to resynthesize the primer to obtain an oligo of suitable quality for sequencing. When high-quality reagents and proper protocols are used during oligo synthesis, cartridge or HPLC purification of the primers is usually not necessary for typical oligos (<30 bp), but sometimes additional purification
6. RNA, proteins, polysaccharides or chromosomal DNA
• DNA that has degraded while in storage
• Silica fines that carry over from template preparation kits that use loose resin or silica solutions

Contaminants

Salts: the processivity of the Taq polymerase used in the cycle sequencing reaction declines in the presence of high amounts of salts. Salt contamination in DNA preps may result from coprecipitation of salts in alcohol precipitations, insufficient removal of supernatant after precipitations, or an incomplete wash of the pellet with 70% ethanol. Careful technique should be used when precipitating with alcohol. It has also been demonstrated that acetate ions, as opposed to sodium, potassium or chloride ions, are the most inhibitory in sequencing reactions. With potassium acetate or sodium acetate, concentrations over 20 mM led to complete failure of the sequencing reactions, while sodium chloride had to reach 60 mM before inhibition was complete.

Ethanol: ethanol contamination can occur when the sample is insufficiently dried after precipitation, or when ethanol is carried over in an ethanol-containing wash buffer used in some DNA isolation procedures. Contamination with ethanol at concentrations of 10% or greater usually leads to failure of the DNA sequencing reaction. Complete drying of the DNA samples is required to remove these traces of ethanol.

Phenol: phenol may be carried over from DNA alkaline lysis methods that use phenol and
7. Sequencing troubleshooting

If you have problems with your results, this is for you.

You've just received your results and you are all excited. You open the text file and, surprise, your sequence is full of Ns, or worse, there are only 5 of them and that's all. All that work for useless results!

Before calling us for a re-run, please take 2 minutes and check your chromatogram. It might help you figure out why your reaction failed. If you recognize your chromatogram pattern in one of these figures, take this document and read it. You can always e-mail us to discuss these features and kindly ask for a re-run if you strongly think the failure was our fault.

[Figures: one example chromatogram for each pattern listed below]

1 No sequence data
2.1 Noisy data from the beginning
2.2 Noisy data from farther in the sequence
3 Homopolymeric regions
4 Truncated sequences
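As a quick first look before opening the chromatogram, the proportion of Ns in the returned text file can be counted. The sketch below is a minimal illustration, assuming the file contains only the base calls as plain text (no FASTA header line); the function name `n_summary` and the example sequence are made up for this illustration and are not part of any sequencing software.

```python
def n_summary(sequence_text: str) -> dict:
    """Count ambiguous N calls in a plain-text sequence.

    Whitespace and line breaks are ignored; case is normalized.
    """
    bases = "".join(sequence_text.split()).upper()
    length = len(bases)
    n_count = bases.count("N")
    fraction = n_count / length if length else 0.0
    return {"length": length, "n_count": n_count, "n_fraction": fraction}


if __name__ == "__main__":
    example = "ACGN NNTA\n"  # made-up read, for illustration only
    print(n_summary(example))
```

A high N fraction only says that base calling was ambiguous; the chromatogram is still needed to tell which of the patterns above is responsible.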
8. can be difficult to sequence through accurately. Short stretches of homopolymeric regions are generally not difficult to get through, but longer sections can be challenging. Sequence data up to and including the polynucleotide region may be fine, but the last base of the poly region and all peaks following it may show a wave-like, stuttering pattern of double peaks that cannot be interpreted. This tends to be more problematic in PCR products, but can also occur when sequencing plasmids, especially when trying to sequence the polyA region of cDNA. This difficulty is thought to arise from enzyme slippage, when the growing strand does not stay paired correctly with the template DNA during polymerization through the homopolymer region, thus giving rise to fragments of varying lengths that have the same sequence after this area.

When sequencing cloned DNA with a homopolymer region, several options can be tried. Sequencing the opposite strand can sometimes be more successful, especially when going through a polyG region, as the polyC strand is often easier to get through. Sometimes designing a new primer that is closer to the homopolymeric region can help, as nucleotide concentration and enzyme activity will be in a more optimal range when extending the smaller fragments in the cycle sequencing reaction. When trying to sequence PCR products with homopolymeric regions, it may sometimes be necessary to clone the PCR product in order to read through the repetitive stretch
9. avoid long incubations at higher temperatures, as substantial amounts of DNA will be degraded in this process.

Cause: trend in worsening data
Solutions: if you have previously been able to obtain good sequence data but begin to see a deterioration in quality that gets progressively worse, you may have some contamination in one or more reagents, or have some reagents that have reached the end of their usefulness. Make up fresh stocks of commonly used reagents, such as buffers, and always use high-quality distilled water in your preparations.

Cause: inefficient primer binding (low Tm, degenerate primers, mismatch)
Solutions: the Tm of a primer is defined as the temperature at which 50% of the oligonucleotide and its perfect complement are in duplex. The Tm of an oligo can be roughly calculated using the formula:

Tm = 2 °C × (A + T) + 4 °C × (G + C)

This is the most commonly used formula for calculating Tm, though it is not the most accurate, as it does not factor in salt or formamide concentrations. A good website to check out if you are interested in some detailed theory behind Tm calculations is http://www.sigma-genosys.com/oligo_meltingtemp.asp. In our cycle sequencing reaction, the primer-template annealing step occurs at 50 °C. Thus, if your primer Tm is much lower than 50 °C, hybridization to its complementary template will be much less efficient and fewer extending fragments will be generated. Increase your primer Tm by adding additional
10. purification can be beneficial.

2.2 Noisy data: begins farther into the sequence

Cause: mixed plasmid prep

[Figure: example chromatogram]

Solution: a plasmid prep that is contaminated by more than one product, such as two vectors with different inserts, or vector with insert and vector without, will generally show an early section of clean sequence data (common vector multiple cloning site sequence) followed by double peaks. Occasionally a plasmid may contain more than one vector molecule, or may undergo spontaneous deletions or insertions during growth. The point at which the double peaks begin corresponds to the start of the insert cloning site. To avoid this problem, it's important to carefully pick a single colony from your growth plate, restreaking if necessary, to be sure that your colony is completely clonal. You should follow this up with a restriction digest of your plasmid, run out on an agarose gel, to ensure vector and insert are present as expected.

3 Homopolymeric regions

Cause: homopolymeric regions

[Figure: example chromatogram]

Solution: regions that contain long stretches of a single nucleotide can
11. the presence of certain contaminants, some of which will completely inhibit our sequencing enzyme. Please check the Contaminants section for a list of potential inhibitors. You may need to re-prep your sample to sufficiently remove one or more inhibitory components to obtain any sequence data.

Cause: priming site not present
Solutions: if you've chosen one of the sequencing facility's vector primers (T7, T3, SP6, M13), make sure it is present in your vector. While many of the primers we provide are common to many different vectors, double-check your plasmid maps and sequences. If you've designed your own custom primer from previous sequence data, make sure you were using a reliable area of sequence: look for sharp, well-defined peaks with no ambiguity. Avoid areas where the peaks are broader and not well separated; this occurs towards the end of the sequence, where the fragments are larger and the polymer cannot adequately resolve single nucleotides, causing inaccurate basecalling.

Cause: expired reagents
Solutions: falls under the common sense category.

2 Noisy data with weak signal

[Figure: example chromatogram]

Noisy data can be identified by the presence of multiple peaks and numerous Ns within your sequence. The Sequencing Analysis program assigns an N
12. weak signal.

Cause: not enough DNA
Solutions: double-check your quantitations, stock concentrations, calculations and dilutions. Make sure you've provided the appropriate amount of DNA and/or primer.

Cause: inhibitory contaminant (e.g. salts, phenol)
Solutions: the cycle sequencing reaction used to amplify samples for automated sequencing is very sensitive to the presence of certain contaminants, some of which can partially or completely inhibit our sequencing enzyme. You may need to re-purify your sample to sufficiently remove one or more inhibitory components to obtain better sequence data.

Cause: degraded DNA (from nucleases, repeated freeze-thaw, excessive UV light exposure, bisulfite treatment)
Solutions: nuclease contamination in a template preparation, as well as repeated freeze-thaw cycles, can degrade DNA over time. Even low amounts of nucleases can extensively degrade DNA, depending on storage conditions and temperatures as well as the length of time the DNA is stored. Generally, re-isolation and purification of the template DNA will be necessary to obtain good DNA sequence. When extracting PCR products from a gel, prolonged exposure to UV light will degrade and nick the DNA. Limit the time and UV intensity as much as possible to prevent degradation. When treating DNA with bisulfite for methylation experiments, it is important to
13. stretch.

Cause: compression
Solution: compressions can sometimes be observed when a region of secondary structure forms in the amplified strand of DNA, leading to an alteration in the electrophoretic mobility of the DNA strand. This can appear as overlapping fragments after a certain point and can resemble a contaminated plasmid prep, but the contaminated prep will show double peaks beginning at the insertion site. To relax this compression, we can sometimes alter cycle sequencing conditions or use additives to denature the secondary structure. Alternatively, you can linearize your DNA or use 7-deaza-dGTP in a PCR reaction to help relieve the compression.

Cause: frame shift mutation
Solution: a frame shift mutation can occur when one or more bases are inserted into or deleted from the template DNA. If multiple products are present in your sample, whether it be plasmid DNA or PCR product, you will see clean sequence up to the point of the mutation, followed by double peaks caused by the shift in the nucleotide sequence. In the case of plasmid DNA, it will be necessary to re-isolate your DNA to get a pure clone containing only one of the molecules. With PCR products, you will need to gel purify the two products in order to separate them.

4 Truncated sequences

Truncated sequences can be characterized as abrupt or gradual. Abrupt truncations
14. additional bases to the 5' or 3' end to raise the Tm to within the range of 52 °C to 58 °C. Degenerate primers and those with mismatched bases will also show decreased hybridization efficiency due to the reduced stability of primer binding, and if degeneracy or mismatches occur at or near the 3' end of your primer, it is highly likely that your sequencing attempt will fail.

2.1 Noisy data: from the beginning

[Figure: example chromatogram]

Cause: multiple priming sites (involving vectors)
Solution: your primer may have a secondary hybridization site that may be identical or closely related, with different nucleotide sequences following each site, giving superimposed bands within your sequence. If the priming sites are identical, such as when more than one T7 promoter site is present, the double peaks will be strong from the outset. The fragments may also show shifted migration, so that the double peaks are not directly on top of one another but are offset to one side or the other, due to the differing mobility patterns of strands with dissimilar nucleotide composition. In other instances, a secondary priming site may not be exactly the same but may differ by a few internal bases. In this case, the mismatched primer may not hybridize as
15. allow us to read through it. Placing a primer as close to the hairpin loop as possible, to help force its unwinding, has also worked in the past. Sequencing the opposite strand can sometimes lead to a huge improvement. If these solutions don't work, we may suggest you try linearizing your DNA with restriction enzymes to help relax the hairpin. If you are trying to PCR up a very G-C rich region, addition of betaine or DMSO to your PCR reaction can help, as can substitution of 7-deaza-dGTP for 75% of the dGTP in your PCR reaction. And if all else fails, you can try manual radioactive sequencing as a last resort.

Cause: linearized DNA
Solution: if your DNA has been cut with one or more restriction enzymes, the sequence data will end sharply at the recognition site of the enzyme that cut at the 3' end of your insert. Did you accidentally send us digested DNA? Run it out on a gel to see.

Cause: too much DNA
Solution: while there is a range of DNA concentrations we can sequence reliably, too much DNA will cause premature termination of signal. Overloaded DNA will exhibit early "top-heavy" peaks followed by rapidly weakening peak height and strength. This occurs because the dNTPs in the cycle sequencing reaction are distributed among too many extending chains and are depleted early on, resulting in an excessive amount of short
16. truncations will show strong, clean signal up to a point and then drop sharply over the course of a few nucleotides to much weaker or no detectable signal. Gradual truncations will show good sequence data initially, but then begin to taper off to progressively weaker, smaller peaks until there is nothing but background noise. The nature of the truncation can sometimes help to determine its cause.

Cause: secondary structure
Solution: G-C rich (and, to a lesser degree, A-T rich) DNA is predisposed to secondary structure formation, as strong hydrogen bonding between G and C nucleotides can cause the template DNA to loop or bend and anneal to complementary sequences, forming hairpins that can restrict the passage of the sequencing polymerase and thus be very difficult to sequence through reliably. These hairpins may not melt at our cycle sequencing temperatures and can cause premature termination of sequence data. Secondary structure may appear as a sharp termination of signal with no sequence data after it or, if the loop has been relaxed slightly, you may see strong signal that drops abruptly but is followed by some weaker peaks that are still quite accurate. With the newest formulation of BigDye Terminator chemistry (v3.1), some G-C rich difficulties have improved dramatically, but unfortunately it hasn't solved everything. There is no single solution that resolves every secondary structure problem, but there are a couple you can try, and usually one will allow
17. both the forward and reverse primer in the PCR reaction, giving rise to an artifactual product. This is fairly easy to detect when sequencing the PCR product, as one primer will give double peaks from the start while the other fails to give any sequence data. Redesign your set of PCR primers.

Cause: residual PCR primers and/or dNTPs
Solution: as two primers are present in the PCR reaction, incomplete removal of these primers can lead to double peaks within the sequencing data. Both primers will act as sequencing primers and lead to superimposed bands, which correspond to the complementary strands read from opposite orientations. It is critical to remove excess primers and dNTPs from the PCR reaction by purification. If you attempt direct sequencing of PCR products without purification, by diluting an aliquot of your PCR product with water to lower the concentration of residual primers and dNTPs (a method which we do not recommend), then it is imperative to optimize your PCR reaction so that primers and dNTPs are used in limiting amounts and most are used up by the end of the PCR.

Cause: primers with high Tm
Solution: primers that have a Tm much higher (>65 °C) than our suggested 52 °C to 58 °C often do not function well as sequencing primers. When primers have a Tm that high, it is often a result of increased G-C content or because the
18. short fragments. In addition, if your template is impure, higher concentrations of DNA can be accompanied by higher amounts of contaminants that can further worsen your DNA sequence quality.

Cause: salts
Solution: excessive amounts of salts will also give rise to premature termination, and may look similar to DNA overloading, with strong signal followed by progressively weakening signal. Salts have an inhibitory effect on the processivity of the sequencing Taq polymerase, which can lead to an overabundance of short fragments or, if the salt concentration is too great, to complete inhibition of the enzyme with no sequence data obtained. If salts are potentially a problem, perform an ethanol precipitation for salt removal.

Cause: repetitive regions

[Figure: example chromatogram]

Solution: the nucleotide composition as well as the size of a repetitive region can play a large role in the success of sequencing through such an area. In general, G-C and G-T (often seen in bisulfite-treated DNA) repeats tend to be the most troublesome, though as mentioned before, the newest version of Applied Biosystems' BigDye Terminator (v3.1) contains some modifications that have allowed for some striking
19. striking improvements in certain previously difficult templates. However, there are still some that remain a pain. In general, one can sequence partially through the repetitive region before the signal begins to fade and eventually becomes unreadable. This may be due to premature dNTP depletion, secondary structure formation or enzyme slippage. Various methods can be tried to sequence the repeat entirely, and many are similar to those we would use for G-C rich templates that form secondary structures. If the repeat region is not excessively large, sequencing from the opposite strand to complete the region can be successful, especially if the complementary strand has a nucleotide composition that is more efficiently extended. However, if the region is large, it may be difficult to complete its entire sequence and determine the exact number of repeats present. Alternative methods, such as directed deletions or the use of an in vitro transposon system, may need to be used.

Plasmid templates

Considerations when cleaning up plasmid preps

Poor template quality is one of the most common reasons for bad sequence data, as mentioned above, and is a prime consideration when choosing a plasmid cleanup method to give DNA of optimal purity for automated sequencing. Plasmid template quality can be affected by a variety of factors and contaminants, including the following:

• Salts or organics left over from template preparation
• Presence of cellular components such as RNA
