Search for Genetic Material
By the late 19th century, scientists knew that traits pass from parents to offspring. But which molecule inside the cell carries this information? The hunt for the identity of the genetic material spanned three landmark experiments.
Griffith's Experiment (1928)
Frederick Griffith worked with two strains of Streptococcus pneumoniae (also called Diplococcus pneumoniae):
- S-strain (Smooth): has a polysaccharide capsule; colonies look smooth. Virulent — kills mice when injected.
- R-strain (Rough): no capsule; colonies look rough. Non-virulent — harmless to mice.
Griffith injected mice with four different preparations:
- Live R-strain alone: mouse survives.
- Live S-strain alone: mouse dies.
- Heat-killed S-strain alone: mouse survives.
- Heat-killed S-strain + live R-strain: mouse dies. Living S-strain bacteria were recovered from the dead mouse!
Griffith's conclusion
The heat-killed S-strain released a "transforming principle" that converted live R-strain into live, virulent S-strain. Griffith did NOT know what the transforming principle was. The transformation was heritable.
Avery, MacLeod, and McCarty Experiment (1944)
Oswald Avery, Colin MacLeod, and Maclyn McCarty purified the transforming principle from S-strain and tested each class of biomolecule by treating the extract with specific enzymes before adding to R-strain:
- RNase treatment: transformation still occurred. RNA is not the transforming principle.
- Protease treatment: transformation still occurred. Protein is not the transforming principle.
- DNase treatment: transformation was destroyed. DNA IS the transforming principle.
Significance
This was the first biochemical evidence that DNA is the genetic material. However, many scientists still doubted it because trace protein contamination of the DNA preparations could not be fully ruled out.
Hershey-Chase Experiment (1952)
Alfred Hershey and Martha Chase used bacteriophage T2 (a virus that infects E. coli) to settle the debate. Their key insight: protein contains sulphur (not phosphorus), and DNA contains phosphorus (not sulphur).
- Experiment 1: Phage protein labelled with 35S (radioactive sulphur). After infecting bacteria + blending (to separate phage coat from bacteria) + centrifugation: 35S found in the supernatant OUTSIDE bacteria (in the phage coat). Bacteria had very little 35S.
- Experiment 2: Phage DNA labelled with 32P (radioactive phosphorus). After infection + blending + centrifugation: 32P found INSIDE the bacteria. Very little 32P in the supernatant.
Conclusion: DNA is the genetic material
Only the phage DNA entered the bacterial cell. The protein coat stayed outside (it is only a "ghost" used to deliver DNA). The injected DNA directed the production of new phage particles inside the bacteria. This definitively proved DNA is the genetic material.
RNA as genetic material
In some viruses (like Tobacco Mosaic Virus — TMV, and many RNA viruses including HIV, influenza, poliovirus), the genetic material is RNA, not DNA. The RNA World hypothesis proposes that RNA was the original genetic material before DNA evolved, because RNA can both store information AND catalyse reactions (ribozymes).
DNA Double Helix: Watson and Crick Model (1953)
James Watson and Francis Crick proposed the double helix model in 1953, using X-ray diffraction data from Rosalind Franklin and Maurice Wilkins (particularly Franklin's "Photo 51"), and Chargaff's base composition data. They won the Nobel Prize in 1962 (along with Wilkins; Franklin had died in 1958 and the prize is not awarded posthumously).
Structure of a Nucleotide
DNA is a polymer of deoxyribonucleotides. Each nucleotide has three components:
- Phosphate group (negatively charged; forms the backbone)
- Deoxyribose sugar (5-carbon sugar; the 2'-carbon lacks an -OH group, making it "deoxy")
- Nitrogenous base — one of four: Adenine (A), Guanine (G) — purines (double ring); Thymine (T), Cytosine (C) — pyrimidines (single ring)
Nucleotides are linked by 3'-5' phosphodiester bonds: the 3' carbon of one sugar is bonded to the 5' carbon of the next through a phosphate.
Features of the Double Helix
- Two antiparallel strands: one strand runs 5' to 3', the other runs 3' to 5'. Antiparallel means they run in opposite directions.
- Base pairing (complementary): A pairs with T (2 hydrogen bonds); G pairs with C (3 hydrogen bonds). These are called Watson-Crick base pairs.
- Sugar-phosphate backbone is on the OUTSIDE; nitrogenous bases are on the INSIDE, stacked on top of each other (base stacking also contributes to stability).
- Right-handed helix (B-form DNA is the standard form in cells).
- Major groove (wider, ~22 Å) and minor groove (narrower, ~12 Å) alternate. Transcription factors and other proteins read the base sequence by binding the major groove.
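Because the two strands are complementary and antiparallel, the sequence of one strand fixes the sequence of the other. The short Python sketch below illustrates this using the pairing rules listed above; the function names and the example sequence are made up for illustration.

```python
# Watson-Crick pairing rules from the list above: A-T (2 H-bonds), G-C (3 H-bonds).
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def partner_strand(strand_5to3: str) -> str:
    """Return the complementary strand, also written 5'->3'.

    Complementing gives the partner read 3'->5'; reversing it
    rewrites the same strand in the 5'->3' direction (antiparallel).
    """
    complement = [PAIR[base] for base in strand_5to3.upper()]
    return "".join(reversed(complement))


def hydrogen_bonds(strand_5to3: str) -> int:
    """Total hydrogen bonds holding this strand to its partner."""
    return sum(H_BONDS[base] for base in strand_5to3.upper())


example = "ATGCCCGAT"  # hypothetical sequence, written 5'->3'
print(partner_strand(example))
print(hydrogen_bonds(example))
```

Applying `partner_strand` twice returns the original sequence, which is a quick way to check the antiparallel logic.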
DNA dimensions — memorise these for NEET
- 0.34 nm (3.4 Å) — distance between two consecutive base pairs (rise per base pair)
- 3.4 nm (34 Å) — pitch (one complete turn of helix)
- 10 — base pairs per complete turn (3.4 nm ÷ 0.34 nm)
- 2 nm (20 Å) — diameter of the double helix
Chargaff's Rules
Erwin Chargaff (1950) analysed DNA base composition from many organisms and found consistent ratios. His rules are the foundation of base pairing:
- A = T (molar amounts of adenine and thymine are equal)
- G = C (molar amounts of guanine and cytosine are equal)
- A + G = T + C (total purines = total pyrimidines)
- The A+T / G+C ratio varies between species (species-specific)
These rules only apply to double-stranded DNA. In single-stranded DNA or RNA, A does not necessarily equal T (or U).
Worked example: base pairing and Chargaff's rules
In double-stranded DNA, A = T and G = C, so a single base percentage fixes the other three. If A = 30%, then T = 30% and G = C = 20%, giving A + G = 50% (purines) = T + C = 50% (pyrimidines). If A = 20%, then T = 20% and G = C = 30%. This is exactly how NEET asks Chargaff's rule questions!
- Adenine (purine) pairs with thymine (pyrimidine) through 2 hydrogen bonds. AT-rich regions melt more easily, which is why they are used as origins of replication.
- Guanine pairs with cytosine through 3 hydrogen bonds (stronger than A-T). That is why GC-rich regions are harder to melt, which is important for PCR primer design.
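Chargaff's rules for double-stranded DNA can be turned into a small calculator. The sketch below assumes the input is the percentage of adenine in double-stranded DNA; the function name is hypothetical.

```python
def chargaff_percentages(percent_a: float) -> dict:
    """Given %A in double-stranded DNA, return the % of all four bases.

    Uses A = T and G = C, so A + G must equal 50%.
    """
    if not 0 <= percent_a <= 50:
        raise ValueError("%A must lie between 0 and 50 in double-stranded DNA")
    percent_g = 50.0 - percent_a
    return {"A": percent_a, "T": percent_a, "G": percent_g, "C": percent_g}


for a in (20, 30):
    print(a, chargaff_percentages(a))
```

The range check reflects the rule that A and T together can never exceed 100% of the bases. The calculation does not apply to single-stranded DNA or RNA, where A need not equal T (or U).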
Packaging of DNA: From Nucleosome to Chromosome
A human cell contains about 2 metres of DNA if stretched out. It must be packaged into a nucleus of just 6 micrometres (6 µm) in diameter. This is achieved through several levels of compaction, starting with the nucleosome.
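The "about 2 metres" figure can be checked with simple arithmetic. The sketch below uses numbers quoted elsewhere in these notes (0.34 nm rise per base pair, roughly 3.2 billion bp per haploid genome) and assumes a diploid cell carrying two copies of the genome; the variable names are illustrative only.

```python
RISE_PER_BP_NM = 0.34        # distance between consecutive base pairs
HAPLOID_GENOME_BP = 3.2e9    # approximate haploid human genome size
NUCLEUS_DIAMETER_UM = 6.0    # nucleus diameter quoted in the text


def dna_length_m(base_pairs: float, rise_nm: float = RISE_PER_BP_NM) -> float:
    """Length of stretched-out B-DNA in metres (1 nm = 1e-9 m)."""
    return base_pairs * rise_nm * 1e-9


diploid_bp = 2 * HAPLOID_GENOME_BP           # assumption: diploid somatic cell
length_m = dna_length_m(diploid_bp)
ratio = length_m / (NUCLEUS_DIAMETER_UM * 1e-6)

print(f"Total DNA length: {length_m:.2f} m")
print(f"DNA length / nucleus diameter: {ratio:,.0f}")
```

The result comes out at a little over 2 m, several hundred thousand times the diameter of the nucleus, which is why multiple levels of packaging are needed.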
Level 1: The Nucleosome (10 nm fibre)
The nucleosome is the fundamental repeating unit of chromatin. It consists of:
- Histone octamer: 8 histone proteins — two copies each of H2A, H2B, H3, and H4. Histones are rich in positively charged amino acids (lysine and arginine), allowing them to interact with the negatively charged phosphate groups of DNA.
- 146 bp of DNA wound 1.65 times around the histone octamer.
- H1 (linker histone): sits at the entry and exit of DNA on the nucleosome; it is NOT part of the octamer. H1 is responsible for higher-order chromatin compaction.
- Linker DNA: 20-80 bp of DNA connecting adjacent nucleosomes.
The 10 nm fibre looks like "beads on a string" under the electron microscope: each bead is a nucleosome; the string is linker DNA.
Level 2: 30 nm Solenoid
The 10 nm fibre folds into a 30 nm solenoid (also called the 30 nm chromatin fibre) with 6 nucleosomes per turn. H1 histone is required for this compaction step. This gives about a 6-fold compaction beyond the nucleosome level.
Level 3: 300 nm Looped Domains
The 30 nm fibre forms loops attached to a protein scaffold (made of non-histone chromosomal proteins), giving 300 nm fibres. The DNA sequences that anchor the loops to the scaffold are called scaffold-attached regions (SARs) and also act as gene regulatory elements.
Level 4: Metaphase Chromosome (1400 nm)
The fully condensed metaphase chromosome is about 1400 nm wide. The total packing ratio from naked DNA to metaphase chromosome is about 10,000-fold. This maximum compaction occurs during cell division so chromosomes can be accurately segregated.
Packing levels summary
- Naked DNA → 10 nm fibre (nucleosome): ~6-7× compaction
- 10 nm → 30 nm solenoid: 6 nucleosomes per turn, ~40× compaction total
- 30 nm → 300 nm loops (scaffold): ~1,000× compaction total
- 300 nm → 700 nm → 1,400 nm metaphase chromosome: ~10,000× total
DNA Replication (Semiconservative)
During cell division, DNA must be copied exactly so that each daughter cell receives a complete genome. DNA replication is the process by which a double-stranded DNA molecule is duplicated to produce two identical daughter molecules.
Meselson-Stahl Experiment (1958): Proving Semiconservative Replication
Three models existed for how DNA could be copied:
- Conservative: both original strands stay together; both new strands pair with each other. After one generation: one heavy (15N-15N) + one light (14N-14N) DNA.
- Semiconservative: each daughter double helix gets one original strand and one new strand. After one generation: all DNA is intermediate (15N-14N hybrid).
- Dispersive: old and new DNA are randomly mixed in both strands. After one generation: all DNA is intermediate; after two generations: still all intermediate (getting lighter).
Meselson and Stahl grew E. coli in 15N medium (heavy nitrogen) for many generations, making all DNA uniformly heavy. They then transferred to 14N medium and harvested cells at each generation. DNA was separated by CsCl density-gradient centrifugation:
- After 1 generation: ONE band at INTERMEDIATE density (15N-14N hybrid). This ruled out conservative replication (which would give two bands).
- After 2 generations: TWO bands, 50% intermediate + 50% light (14N-14N). This ruled out dispersive replication (which would give a single band of intermediate density that becomes lighter with each generation).
Conclusion: DNA replication is semiconservative
Each daughter DNA molecule contains one parental (template) strand and one newly synthesised strand. This was confirmed by the exact ratio of intermediate to light DNA seen after two generations.
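The band pattern predicted by the semiconservative model can be reproduced with a small simulation. This is a minimal sketch, not a model of the actual experiment: each duplex is represented as a pair of strand labels, and every new strand is assumed to be made from 14N. All names are hypothetical.

```python
from collections import Counter
from fractions import Fraction


def replicate(duplexes):
    """Semiconservative copy: each old strand pairs with a new 14N strand."""
    daughters = []
    for strand_a, strand_b in duplexes:
        daughters.append((strand_a, "14N"))
        daughters.append((strand_b, "14N"))
    return daughters


def band(duplex):
    """Classify a duplex by how many of its two strands are heavy."""
    return {2: "heavy", 1: "intermediate", 0: "light"}[duplex.count("15N")]


def band_fractions(generations):
    """Fraction of duplexes in each density band after n generations in 14N."""
    population = [("15N", "15N")]  # start: fully heavy DNA
    for _ in range(generations):
        population = replicate(population)
    counts = Counter(band(d) for d in population)
    return {name: Fraction(n, len(population)) for name, n in counts.items()}


for g in range(4):
    print(g, band_fractions(g))
```

Generation 1 gives only intermediate DNA and generation 2 gives half intermediate and half light, matching the observed bands. In later generations the intermediate fraction halves each time but never drops to zero.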
The Replication Fork
Replication begins at specific sequences called origins of replication (ORI). In E. coli, there is one ORI (oriC); in eukaryotes, there are multiple origins to replicate the larger genome faster. At each origin, the two strands separate to form a replication bubble with two replication forks that move in opposite directions (bidirectional replication).
Key rule: DNA polymerase can only synthesise DNA in the 5' to 3' direction (adding new nucleotides to the 3'-OH of the growing chain). This creates an asymmetry at the fork:
- Leading strand: template runs 3' to 5' in the same direction as fork movement. Synthesised continuously 5' to 3'. Requires only ONE RNA primer.
- Lagging strand: template runs 5' to 3' (opposite to fork movement). Synthesised discontinuously as Okazaki fragments, each requiring its own RNA primer. Okazaki fragments are ~1,000-2,000 nt in bacteria and ~100-200 nt in eukaryotes.
Replication fork enzymes
- Helicase: unwinds the double helix by breaking the hydrogen bonds between base pairs, using ATP. It works at the replication fork itself and is the "engine" that opens the helix.
Schematic of the replication fork:
3'─────────────────────────────── 5' (parental template)
↑ HELICASE (opens helix)
5'─────────────────── 3' (parental template)
LEADING strand (synthesised continuously toward fork):
5'─PRIMER─────────────────────────────→ 3' (DNA Pol III)
LAGGING strand (synthesised away from fork as Okazaki fragments):
←─Fragment 3─|←─Fragment 2─|←─Fragment 1─ (each with own primer)
DNA Pol I removes primers; Ligase joins fragments
Points to remember
- Primase is the only enzyme at the replication fork that can start a new polynucleotide chain from scratch. DNA Pol III cannot; it can only extend an existing 3'-OH end.
- DNA Pol I does TWO things: it removes the RNA primer AND fills the resulting gap with DNA. It has both 5'→3' exonuclease and 5'→3' polymerase activities.
- NEET trap: the lagging strand needs MULTIPLE RNA primers (one per Okazaki fragment), while the leading strand needs only ONE.
Accuracy of Replication
DNA polymerase has a proofreading activity (3' to 5' exonuclease): if a wrong nucleotide is added, the enzyme removes it and replaces it with the correct one. Together with post-replication mismatch repair, this reduces the error rate to approximately 1 mistake per 10⁹ to 10¹⁰ base pairs, an extraordinary level of fidelity.
Transcription: DNA to RNA
Transcription is the synthesis of RNA from a DNA template. Only one strand of DNA is used as the template, and the RNA produced has the complementary sequence.
Template Strand vs Coding Strand
- Template strand (antisense strand, non-coding strand, (-) strand): read by RNA polymerase in the 3' to 5' direction. Its sequence is complementary to the mRNA.
- Coding strand (sense strand, non-template strand, (+) strand): has the SAME sequence as the mRNA (with T replaced by U). It is NOT read during transcription.
- The mRNA is produced in the 5' to 3' direction, complementary to the template strand.
Example
Template strand (3'→5'): 3'-TACGGGCTA-5'
Coding strand (5'→3'): 5'-ATGCCCGAT-3'
mRNA produced (5'→3'): 5'-AUGCCCGAU-3'
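The example above can be reproduced in a few lines of Python. This sketch assumes the template is written 3'→5', as in the example, so reading it left to right yields the mRNA 5'→3' with no reversal; the function names are hypothetical.

```python
# Base-pairing rules: template DNA base -> RNA base added by RNA polymerase.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
# Template DNA base -> base on the coding (non-template) DNA strand.
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3to5: str) -> str:
    """mRNA (5'->3') made from a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3to5.upper())


def coding_strand(template_3to5: str) -> str:
    """Coding strand (5'->3') paired with a template written 3'->5'."""
    return "".join(DNA_TO_DNA[base] for base in template_3to5.upper())


template = "TACGGGCTA"
print("Coding strand:", coding_strand(template))
print("mRNA:         ", transcribe(template))
```

The coding strand and the mRNA differ only in T versus U, which is why the coding strand is the one usually quoted as "the gene sequence".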
Transcription in Prokaryotes
In bacteria (E. coli is the model organism), one RNA polymerase (with multiple subunits) does all transcription. The sigma (σ) factor is a dissociable subunit that recognises the promoter (specifically the Pribnow box at -10 and the -35 element). The steps are:
- Initiation: sigma factor directs RNA polymerase holoenzyme (α₂ββ' + σ) to the promoter. The DNA strands are melted locally.
- Elongation: sigma factor dissociates; core enzyme (α₂ββ') moves along the template adding ribonucleotides. No primer is needed because RNA polymerase can initiate de novo.
- Termination: the RNA polymerase reaches a terminator sequence. Two types: (1) Rho-independent (intrinsic): RNA forms a hairpin loop followed by a poly-U sequence; (2) Rho-dependent: Rho protein catches up to RNA pol and triggers dissociation.
In bacteria, transcription and translation are coupled: ribosomes bind the 5' end of the mRNA and begin translation while transcription is still occurring. There is no nucleus, so the mRNA does not have to be exported before it is translated.
Transcription in Eukaryotes
Eukaryotic transcription occurs in the nucleus and is more complex:
- Three RNA polymerases: RNA Pol I (makes rRNA — 28S, 18S, 5.8S), RNA Pol II (makes mRNA and some snRNAs), RNA Pol III (makes tRNA, 5S rRNA).
- RNA Pol II requires a complex of general transcription factors (TFIIA, TFIIB, TFIID — contains TBP that binds the TATA box, TFIIE, TFIIF, TFIIH) to form a preinitiation complex at the promoter.
- The pre-mRNA (hnRNA = heterogeneous nuclear RNA) undergoes extensive post-transcriptional processing before becoming mature mRNA.
Post-Transcriptional Processing (Eukaryotes only)
- 5' Capping: a 7-methylguanosine (m⁷G) cap is added to the 5' end by guanylyltransferase. Protects mRNA from degradation by 5' exonucleases; helps ribosome recognition during translation initiation.
- 3' Polyadenylation: a poly-A tail (100-200 adenylate residues) is added to the 3' end. Protects from 3' degradation; helps nuclear export and translation efficiency.
- Splicing: introns (non-coding intervening sequences) are removed by the spliceosome (made of snRNPs — small nuclear ribonucleoprotein particles), and exons (coding sequences) are joined. Alternative splicing can generate different proteins from the same gene.
NEET key: introns vs exons
Introns = intervening sequences = non-coding sequences present in DNA and pre-mRNA; REMOVED during splicing. Exons = expressed sequences = coding sequences; RETAINED in the mature mRNA. "Exons are expressed; introns intervene and are removed."
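Splicing can be pictured as cutting out intron intervals and joining what remains. The sketch below is a toy illustration only: the pre-mRNA sequence and the intron coordinates are invented, and real splice-site recognition by the spliceosome is not modelled.

```python
def splice(pre_mrna: str, introns: list) -> str:
    """Remove introns and join the exons.

    `introns` holds (start, end) index pairs, end-exclusive,
    counted from 0 along the pre-mRNA.
    """
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[position:start])  # exon before this intron
        position = end                          # skip over the intron
    exons.append(pre_mrna[position:])           # final exon
    return "".join(exons)


# Hypothetical pre-mRNA: exon 1 + intron + exon 2
exon_1, intron, exon_2 = "AUGGCC", "GUAAGUUUCAG", "GAUUAA"
pre_mrna = exon_1 + intron + exon_2
mature = splice(pre_mrna, [(len(exon_1), len(exon_1) + len(intron))])
print(mature)
```

Passing a different set of intron intervals for the same pre-mRNA gives a different mature mRNA, which is the idea behind alternative splicing.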
Transcription rules to remember:
- Template strand is read 3'→5'; mRNA is made 5'→3'
- DNA A → RNA U (uracil replaces thymine in RNA)
- DNA T → RNA A; DNA G → RNA C; DNA C → RNA G
- The coding strand has the SAME sequence as mRNA (replace T with U)
- AUG = start codon; UAA / UAG / UGA = stop codons
Worked examples
- Template 3'-TAC-5' gives the mRNA 5'-AUG-3', the start codon. AUG codes for methionine, the first amino acid incorporated into every newly made eukaryotic polypeptide.
- Template 3'-ATT-5' gives the mRNA 5'-UAA-3', which is the Ochre STOP codon.
Genetic Code: Codons and Their Properties
The genetic code is the set of rules by which the information encoded in mRNA (as codons — sequences of 3 nucleotides) is translated into the amino acid sequence of proteins. Breaking the genetic code was a major scientific achievement of the 1960s.
Deciphering the Code
- Francis Crick (1961): used genetic experiments to show the code is triplet, non-overlapping, and has a fixed reading frame.
- Nirenberg and Matthaei (1961): used a cell-free translation system with synthetic poly-U mRNA. Only one amino acid was incorporated: phenylalanine. Therefore UUU = phenylalanine — the first codon to be deciphered.
- H. Gobind Khorana: synthesised polynucleotides with known repeating sequences to decipher more codons.
- Robert Holley: determined the first tRNA sequence.
- Nirenberg, Khorana, and Holley shared the 1968 Nobel Prize in Physiology or Medicine.
Properties of the Genetic Code
- Triplet: each codon consists of 3 nucleotides. 4³ = 64 possible codons from 4 bases (A, U, G, C).
- Non-overlapping: each nucleotide belongs to only one codon. The reading frame is fixed from the start codon.
- Commaless (continuous): there are no "punctuation" nucleotides between codons; they are read continuously.
- Degenerate (redundant): most amino acids are specified by MORE than one codon (synonymous codons). With 61 sense codons for only 20 amino acids (the other 3 of the 64 are stop codons), degeneracy is inevitable. Most degeneracy is at the third (3') position, called the "wobble position" (Crick's wobble hypothesis).
- Unambiguous: each codon codes for only ONE specific amino acid, so a given codon is never read as two different amino acids.
- Universal: the same genetic code is used by nearly all living organisms — from bacteria to plants to humans — implying a single common ancestor. Exceptions: mitochondria (human mitochondria: UGA = Trp, AGA = Stop), some protists.
- Start codon: AUG — codes for methionine (eukaryotes) or N-formylmethionine (prokaryotes). It also sets the reading frame.
- Stop codons (nonsense codons): UAA (Ochre), UAG (Amber), UGA (Opal or Umber). They do not code for any amino acid. Recognised by release factors, not tRNA.
Must-know codons for NEET
- AUG = Start codon (Met) — only start codon
- UAA, UAG, UGA = Stop codons (3 total)
- UUU = Phenylalanine (first codon deciphered by Nirenberg)
- UGG = Tryptophan (Trp), which has only ONE codon (no degeneracy)
- AUG is also the only codon for Met — two amino acids with single codons: Met and Trp
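The properties above can be checked against the standard codon table. The sketch below builds the table from the conventional one-letter amino acid string ordered by U, C, A, G at each codon position (`*` marks a stop codon). The `translate` helper is a simplified, hypothetical function that starts at the first AUG and stops at the first stop codon.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG until a stop codon (one-letter codes)."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    peptide = []
    for i in range(start, len(mrna) - 2, 3):   # non-overlapping triplets
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":                  # stop codon: release
            break
        peptide.append(amino_acid)
    return "".join(peptide)


stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
codons_per_aa = Counter(aa for aa in CODON_TABLE.values() if aa != "*")
single_codon_aas = sorted(aa for aa, n in codons_per_aa.items() if n == 1)

print(stop_codons)
print(single_codon_aas)          # one-letter codes: M = Met, W = Trp
print(translate("AUGCCCGAU"))
```

Counting codons per amino acid makes the degeneracy visible: only Met (M) and Trp (W) have a single codon, while Leu, Ser, and Arg have six each.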
Genetic code properties (NEET must-know):
- TRIPLET: 3 bases per codon → 64 combinations (4³)
- DEGENERATE: 20 amino acids but 61 sense codons → multiple codons per amino acid
- UNAMBIGUOUS: one codon → only one amino acid (never ambiguous)
- UNIVERSAL: same code in almost all organisms (exceptions: mitochondria)
- NON-OVERLAPPING and COMMALESS: read sequentially without gaps
tRNA: The Adapter Molecule
Transfer RNA (tRNA) is the physical link between the mRNA codon and the amino acid. Francis Crick called it the "adapter molecule" in his adapter hypothesis (1958). tRNA is required because amino acids do not directly recognise nucleotide sequences — there is no direct chemical affinity between a codon and an amino acid.
Structure of tRNA
- Size: 73-93 nucleotides (smallest RNA molecule in the cell, apart from some short regulatory RNAs).
- Cloverleaf 2D structure: has four arms formed by internal base pairing: the acceptor stem, the D arm (dihydrouridine loop), the anticodon arm, and the TψC arm, plus an optional variable loop.
- 3' CCA tail: every tRNA ends in the sequence 5'-CCA-3' at the 3' end. This is where the amino acid is attached (esterified) by the enzyme aminoacyl-tRNA synthetase.
- Anticodon loop (positions 34-36): contains the three-base anticodon that base-pairs with the complementary mRNA codon during translation. The anticodon base-pairs in antiparallel orientation with the codon.
- Unusual bases: tRNA contains modified nucleotides: pseudouridine (ψ), dihydrouridine (D), inosine (I), methylated bases. These are important for tRNA stability and function.
Charging of tRNA
Each amino acid is attached to its specific tRNA by an enzyme called aminoacyl-tRNA synthetase (one per amino acid = 20 enzymes). The reaction uses ATP energy: amino acid + tRNA + ATP → aminoacyl-tRNA + AMP + PPi. This charged tRNA is called an aminoacyl-tRNA. The aminoacyl-tRNA synthetase must correctly match each amino acid to its tRNA — this is sometimes called "the second genetic code" because the specificity here is as important as codon-anticodon recognition.
Wobble Hypothesis
Crick proposed that the base pairing at the third (wobble) position of the codon is less strict. One tRNA can recognise multiple codons if they differ only at the third position. For example, inosine (I) in the anticodon can pair with U, C, or A in the codon's wobble position. This explains how fewer than 61 different tRNAs are needed to decode 61 sense codons (some organisms have as few as 40-45 tRNA types).
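The wobble idea can be sketched as a lookup on the first (5') base of the anticodon, which pairs with the third base of the codon. Only the inosine rule comes from the text above; the other rows of the `WOBBLE` table are the standard wobble pairings added here as an assumption, and the function name is hypothetical.

```python
# Strict Watson-Crick pairing for the first two codon positions.
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Anticodon 5' base -> codon 3' bases it can pair with.
# "I" (inosine) is from the text; the rest are the standard wobble rules.
WOBBLE = {"C": "G", "A": "U", "U": "AG", "G": "CU", "I": "UCA"}


def codons_read(anticodon_5to3: str) -> list:
    """List the mRNA codons (5'->3') that one anticodon (5'->3') can read.

    Pairing is antiparallel: anticodon base 3 pairs with codon base 1,
    base 2 with base 2, and base 1 (the wobble base) with codon base 3.
    """
    wobble_base, middle, last = anticodon_5to3.upper()
    first_two = RNA_PAIR[last] + RNA_PAIR[middle]
    return [first_two + third for third in WOBBLE[wobble_base]]


print(codons_read("IGC"))   # an inosine-containing anticodon
print(codons_read("CAU"))   # anticodon that reads the start codon
```

An anticodon beginning with inosine reads three codons that differ only at the third position, which is how fewer than 61 tRNAs can cover all 61 sense codons.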
Translation: Protein Synthesis
Translation is the synthesis of a polypeptide from the mRNA sequence. It occurs at the ribosome and requires mRNA, ribosomes, aminoacyl-tRNAs, and various protein factors.
Ribosome Structure
- Prokaryotic (70S): large subunit (50S) = 23S rRNA + 5S rRNA + ~31 proteins; small subunit (30S) = 16S rRNA + ~21 proteins.
- Eukaryotic cytoplasmic (80S): large subunit (60S) = 28S rRNA + 5.8S rRNA + 5S rRNA + ~49 proteins; small subunit (40S) = 18S rRNA + ~33 proteins.
- Eukaryotic mitochondrial and chloroplast ribosomes: 70S (like prokaryotes) — evidence for the endosymbiotic theory.
The ribosome has three tRNA binding sites: P-site (peptidyl-tRNA site — holds the growing chain), A-site (aminoacyl-tRNA site — holds the incoming charged tRNA), E-site (exit site — holds the uncharged tRNA about to leave).
Peptide bond formation is catalysed by the peptidyl transferase centre of the 23S rRNA in prokaryotes (28S rRNA in eukaryotes), so the ribosome is a ribozyme! The ribosomal proteins play a mainly structural role rather than a catalytic one.
Steps of Translation
- Initiation: the small ribosomal subunit binds the mRNA at the 5' cap (eukaryotes) or at the Shine-Dalgarno sequence near the AUG start codon (prokaryotes). The initiator aminoacyl-tRNA (Met-tRNA in eukaryotes; fMet-tRNA in prokaryotes) enters the P-site and base-pairs with the AUG start codon. The large subunit then joins. Requires initiation factors (IFs in prokaryotes; eIFs in eukaryotes) and GTP.
- Elongation: (a) A new aminoacyl-tRNA enters the A-site (facilitated by elongation factor EF-Tu in prokaryotes + GTP). (b) Peptidyl transferase catalyses transfer of the growing polypeptide from the P-site tRNA to the amino acid at the A-site (peptide bond formation). (c) Translocation: the ribosome moves 3 nucleotides (one codon) in the 5' to 3' direction: the A-site tRNA (now peptidyl-tRNA) moves to the P-site, the P-site tRNA (now uncharged) moves to the E-site, and the E-site tRNA exits. Requires EF-G + GTP in prokaryotes.
- Termination: a stop codon (UAA, UAG, or UGA) enters the A-site. No aminoacyl-tRNA recognises stop codons; instead, a release factor (RF1 recognises UAA/UAG; RF2 recognises UAA/UGA in prokaryotes) enters. RF activates peptidyl transferase to transfer the polypeptide to a water molecule (hydrolysis), releasing it. The ribosome dissociates into its subunits.
Polysome (polyribosome)
A single mRNA can be translated simultaneously by multiple ribosomes (a polysome). Each ribosome moves along the mRNA in the 5' to 3' direction, translating as it goes. This greatly increases the efficiency of protein production from a single mRNA molecule.
Regulation of Gene Expression: Lac Operon
Cells do not express all genes at all times. Gene regulation allows cells to respond to environmental changes and save energy. The lac operon, described by François Jacob and Jacques Monod in 1961 (Nobel Prize 1965), is the classic example of prokaryotic gene regulation.
Structure of the Lac Operon
The lac operon is a cluster of genes on the E. coli chromosome that are all transcribed together into a single polycistronic mRNA. It consists of:
- Regulatory gene (lacI): located upstream; produces the lac repressor protein (tetramer) continuously (constitutive expression). The repressor can bind the operator to block transcription.
- Promoter (P): where RNA polymerase binds to initiate transcription. Has both a CRP-cAMP binding site (positive control) and the polymerase binding site.
- Operator (O): a short DNA sequence (21 bp palindrome) that overlaps with the promoter and transcription start site. When the repressor binds the operator, it physically blocks RNA polymerase.
- lacZ: codes for β-galactosidase (cleaves lactose → glucose + galactose; also makes allolactose from lactose).
- lacY: codes for lactose permease (membrane transporter that imports lactose into the cell).
- lacA: codes for thiogalactoside transacetylase (exact role in lactose metabolism less clear; may help detoxify analogues).
Mechanism: Negative Control (Repressor)
- Without lactose (inducer absent): the lac repressor binds tightly to the operator → RNA polymerase cannot transcribe the structural genes → genes are OFF.
- With lactose present: β-galactosidase (a small amount is always made at basal level) converts some lactose to allolactose (the true inducer). Allolactose binds the lac repressor → repressor undergoes conformational change → cannot bind operator → RNA polymerase transcribes all three structural genes.
Positive Control: Catabolite Repression (CRP-cAMP)
Even when lactose is present, the cell PREFERS glucose. If glucose is available, the lac operon is only weakly expressed (glucose effect / catabolite repression):
- When glucose is high: adenylate cyclase is inhibited → cAMP levels are LOW → cAMP cannot bind CRP (cAMP receptor protein, also called catabolite activator protein, CAP) → CRP cannot bind the promoter → promoter activity is LOW even if lactose is present.
- When glucose is absent: adenylate cyclase is active → cAMP levels are HIGH → cAMP binds CRP → cAMP-CRP complex binds the CRP binding site in the promoter → RNA polymerase is recruited efficiently → STRONG transcription of structural genes.
Maximum lacZ/lacY/lacA expression requires BOTH conditions: lactose present (to remove the repressor) AND glucose absent (to allow cAMP-CRP activation).
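The two control points combine like a simple piece of logic: the repressor acts as an off switch and CRP-cAMP as a volume control. The sketch below encodes the rules described above as a simplified on/off model; the function name and the returned labels are hypothetical.

```python
def lac_operon_state(glucose: bool, lactose: bool) -> dict:
    """Simplified lac operon model with two independent control points."""
    repressor_bound = not lactose     # allolactose (from lactose) releases it
    crp_camp_active = not glucose     # low glucose -> high cAMP -> active CRP

    if repressor_bound:
        transcription = "OFF"         # RNA polymerase is blocked at the operator
    elif crp_camp_active:
        transcription = "HIGH"        # repressor released AND promoter activated
    else:
        transcription = "LOW"         # repressor released, but weak promoter

    return {
        "repressor": "Bound" if repressor_bound else "Free",
        "crp_camp": "Active" if crp_camp_active else "Inactive",
        "transcription": transcription,
    }


for glucose in (True, False):
    for lactose in (False, True):
        print(f"glucose={glucose!s:5} lactose={lactose!s:5}",
              lac_operon_state(glucose, lactose))
```

Checking the repressor first reflects the point that CRP-cAMP cannot activate transcription while the operator is occupied.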
All 4 lac operon states (NEET summary):
| Glucose | Lactose | Repressor | CRP-cAMP | Transcription |
|---|---|---|---|---|
| + | - | Bound | Inactive | OFF |
| - | - | Bound | Active | OFF |
| + | + | Free | Inactive | LOW |
| - | + | Free | Active | HIGH ✓ |
Reading the table
- Glucose absent + lactose present gives MAXIMUM transcription (high cAMP-CRP activation + repressor released).
- Glucose and lactose both absent: with no inducer, the repressor blocks transcription even though CRP-cAMP is active. Two control points must BOTH be satisfied.
Inducible vs Repressible operons
- Inducible operon (lac operon): normally OFF; turned ON by an inducer (allolactose). Controls catabolic pathways (breaking down substrates).
- Repressible operon (trp operon): normally ON; turned OFF when the end product (tryptophan) accumulates. Tryptophan acts as a co-repressor. Controls biosynthetic pathways (making amino acids, nucleotides, etc.).
Human Genome Project (HGP)
The Human Genome Project was an international scientific collaboration to sequence the entire human genome. It ran from 1990 to 2003 and was described as "the most important scientific project of the century."
Goals of the HGP
- Identify all ~20,000-25,000 genes in human DNA.
- Determine the sequences of all 3.2 billion base pairs of the human genome.
- Store this information in databases (NCBI, EBI).
- Develop tools for data analysis (bioinformatics).
- Transfer technologies to private sectors.
- Address the ethical, legal, and social (ELSI) issues that arise.
Technologies Used
- BAC (Bacterial Artificial Chromosome): can hold 150-350 kb inserts; used for mapping and sequencing large genomic fragments.
- YAC (Yeast Artificial Chromosome): can hold even larger inserts (up to 2,000 kb); used in the early stages of the HGP.
- EST (Expressed Sequence Tags): partial sequences of cDNA (made from mRNA by reverse transcriptase); used to identify genes that are expressed.
- Shotgun sequencing: DNA is fragmented randomly, each fragment sequenced, and the sequences assembled computationally. Used by Craig Venter's team at Celera Genomics as an alternative approach.
- Dideoxy (Sanger) sequencing: the original DNA sequencing method used extensively in the HGP.
Key Findings of the HGP
- ~3,164.7 million base pairs (approximately 3.2 billion bp) in the haploid human genome.
- ~20,000-25,000 protein-coding genes (far fewer than the earlier estimate of 100,000+). Surprising — the roundworm C. elegans has ~19,000 genes.
- Only ~2% of the genome codes for proteins. The rest was initially called "junk DNA" but much of it has regulatory functions (enhancers, insulators, non-coding RNAs).
- Average gene is ~3,000 bp, but genes vary widely (dystrophin gene = 2.4 million bp).
- ~1.5 million SNPs (Single Nucleotide Polymorphisms) — single base changes between individuals. Useful for disease mapping.
- Humans are 99.9% genetically identical to each other. The 0.1% differences account for all human genetic variation.
- Repeated sequences (microsatellites, SINEs, LINEs) make up a large portion of the genome.
- Chromosome 1 has the most genes; Y chromosome has the fewest.
DNA Fingerprinting
DNA fingerprinting (also called DNA profiling or DNA typing) is a technique that exploits the fact that each person has a unique DNA sequence (except identical twins). It was developed by Alec Jeffreys at the University of Leicester in 1984 and first used in a legal case in 1986.
The Basis: VNTRs (Variable Number of Tandem Repeats)
The human genome contains repetitive DNA sequences scattered throughout. One class, the VNTRs (minisatellites, a type of satellite DNA), consists of regions where a short sequence (6-100 bp) is repeated in tandem, and the NUMBER of repeats at each locus varies between individuals. For example, one person might have 7 repeats at a given locus; another might have 15 repeats. This variation is heritable, and the probability that two unrelated individuals have exactly the same VNTR length at multiple loci is extremely small.
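As a rough illustration of why repeat number translates into band position, the sketch below computes the restriction-fragment length for a single VNTR locus. The function name, the 16 bp repeat unit and the 200 bp of flanking DNA are made-up values for illustration, not data from any real locus.

```python
def fragment_length(repeats, unit_bp=16, flank_bp=200):
    """Length (bp) of the restriction fragment spanning one VNTR locus.

    unit_bp  : length of one repeat unit (hypothetical value)
    flank_bp : non-repeat DNA between the two cut sites (hypothetical value)
    """
    return flank_bp + repeats * unit_bp


# Two people with different repeat counts at the same locus
person_a = fragment_length(7)    # 200 + 7 * 16 = 312 bp
person_b = fragment_length(15)   # 200 + 15 * 16 = 440 bp

# Sort shortest first: shorter fragments migrate further on the gel
order = sorted({"A": person_a, "B": person_b}.items(), key=lambda kv: kv[1])
print(order)  # [('A', 312), ('B', 440)]
```

Under these assumed numbers, person A's fragment is shorter and would sit further down the gel than person B's.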
The Procedure
- DNA extraction from biological sample (blood, hair root, saliva, semen, tissue).
- Restriction digestion: cut DNA with restriction endonucleases at fixed sequences flanking the VNTR loci. This produces fragments of different sizes depending on how many repeats each person has.
- Gel electrophoresis: separate the fragments by size on an agarose gel. Smaller fragments migrate further.
- Southern blotting: transfer the DNA from the gel to a nylon or nitrocellulose membrane (DNA is denatured first with NaOH to make it single-stranded). Originally by capillary transfer; now often electrotransfer.
- Probe hybridisation: incubate the membrane with a labelled single-stranded DNA probe complementary to the VNTR sequences. The probe hybridises to complementary bands.
- Detection: autoradiography (if radioactive probe) or chemiluminescence (if non-radioactive). The resulting pattern of bands is the DNA fingerprint.
Applications
- Forensic science: identifying suspects from biological material left at crime scenes.
- Paternity testing: every band in a child must be present in either the mother or the biological father.
- Victim identification: identifying victims of disasters using family member DNA.
- Population genetics: studying genetic diversity and evolutionary relationships between species or populations.
- Conservation biology: tracking genetic diversity in endangered species.
DNA fingerprinting: reading gel electrophoresis bands
Each lane shows the VNTR band pattern for one individual. Bands at the same position indicate matching repeat sequences.
Child A has a disputed father. Two men are tested. Which man is the biological father?
[Figure: gel with four lanes (Mother, Child A, Man 1, Man 2); bands run from high MW to low MW. Child A's paternity is in question.]
How DNA fingerprinting works:
- Extract DNA from blood, hair root, saliva, or other cells.
- Cut with restriction enzymes at fixed sites flanking VNTR regions.
- Run on agarose gel electrophoresis — smaller fragments travel further.
- Southern blotting: transfer DNA from gel to nylon membrane.
- Hybridise with a labelled probe that binds to VNTR sequences.
- Autoradiography or chemiluminescence reveals the band pattern.
- Compare band patterns — identical patterns = same individual; shared bands = related.
Try this
- In the paternity test: compare Child A's bands with Mother first. Bands not in the mother MUST come from the biological father. Then check which man has all those remaining bands.
- DNA fingerprinting was invented by Alec Jeffreys in 1984 and first used in a UK immigration case and then famously in a forensic case in Leicestershire, UK.
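The paternity logic described above can be sketched with sets. The band sizes below are invented for illustration, and the function names are hypothetical; a real analysis also has to allow for measurement error and band-sharing by chance.

```python
def paternal_bands(child, mother):
    """Bands in the child that the mother cannot have contributed."""
    return child - mother


def is_possible_father(child, mother, man):
    """True if the man carries every band the child did not get from the mother."""
    return paternal_bands(child, mother) <= man


# Hypothetical band sizes in bp
mother = {300, 450, 700}
child = {300, 520, 700, 900}
man_1 = {250, 520, 900}
man_2 = {300, 610, 900}

print(sorted(paternal_bands(child, mother)))     # [520, 900]
print(is_possible_father(child, mother, man_1))  # True
print(is_possible_father(child, mother, man_2))  # False
```

In this made-up example, Man 2 lacks the 520 bp band, so he is excluded; Man 1 carries both paternal bands and remains a candidate.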
Worked NEET Problems
NEET-style problem · Chargaff's rules
Question
Solution
By Chargaff's rules for double-stranded DNA: A = T, so T = 18%. Since A + T + G + C = 100%: G + C = 100 − 18 − 18 = 64%. Since G = C: G = C = 32%.
Answer: T = 18%, G = 32%, C = 32%.
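The same arithmetic can be written as a small helper. This is a minimal sketch that assumes double-stranded DNA; the function name chargaff is hypothetical.

```python
def chargaff(base, percent):
    """Given the % of one base in double-stranded DNA, return all four.

    Applies A = T and G = C; not valid for single-stranded DNA or RNA.
    """
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    base = base.upper()
    percent = float(percent)
    if base not in pairs:
        raise ValueError("base must be one of A, T, G, C")
    if not 0 <= percent <= 50:
        raise ValueError("one base cannot exceed 50% of ds-DNA")
    # The known base and its partner take 2 * percent; the other pair splits the rest
    other = (100 - 2 * percent) / 2
    return {b: percent if b in (base, pairs[base]) else other for b in "ATGC"}


print(chargaff("A", 18))  # {'A': 18.0, 'T': 18.0, 'G': 32.0, 'C': 32.0}
```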
NEET-style problem · Meselson-Stahl experiment
Question
Solution
After 0 generations (all ¹⁵N): 1 heavy molecule (¹⁵N-¹⁵N).
After 1 generation (semiconservative): 2 molecules, both hybrid (¹⁵N-¹⁴N).
After 2 generations: each hybrid replicates → 4 molecules total: 2 hybrid (¹⁵N-¹⁴N) + 2 light (¹⁴N-¹⁴N).
Answer: 50% of DNA molecules are light (¹⁴N-¹⁴N) after 2 generations.
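The generation-by-generation count can be checked with a short simulation. This is a sketch under the semiconservative model only; each molecule is represented as a pair of strand labels, and the function names are hypothetical.

```python
from collections import Counter


def replicate(molecules):
    """One round of semiconservative replication in 14N medium.

    Each parental strand is kept and paired with a new 14N strand.
    """
    daughters = []
    for strand_a, strand_b in molecules:
        daughters.append((strand_a, "14N"))
        daughters.append((strand_b, "14N"))
    return daughters


def classify(molecule):
    """Label a molecule by how many 15N strands it carries."""
    return {2: "heavy", 1: "hybrid", 0: "light"}[molecule.count("15N")]


def band_counts(generations):
    """Count heavy, hybrid and light molecules after n generations."""
    population = [("15N", "15N")]  # start with one fully heavy molecule
    for _ in range(generations):
        population = replicate(population)
    return Counter(classify(m) for m in population)


for g in range(4):
    print(g, dict(band_counts(g)))
```

The two hybrid molecules persist at every later generation while the light fraction grows, which is why the hybrid band never disappears but becomes a smaller share of the total.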
NEET-style problem · Transcription
Question
Solution
RNA polymerase reads the template 3'→5' and synthesises mRNA 5'→3'. DNA A→RNA U; T→A; G→C; C→G.
Template 3'-ATCGGCTAG-5' → mRNA 5'-UAGCCGAUC-3'
Codons: UAG | CCG | AUC. UAG is a STOP codon (Amber). CCG = Proline, AUC = Isoleucine. No AUG start codon in this fragment.
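The base-by-base conversion above can be sketched in a few lines. The helper names are hypothetical, and the code assumes the template is written 3'→5' so that the mRNA comes out 5'→3' in the same left-to-right order.

```python
# DNA template base -> RNA base
COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}
STOP_CODONS = {"UAA", "UAG", "UGA"}


def transcribe(template_3_to_5):
    """Return the mRNA (5'->3') for a template strand written 3'->5'."""
    return "".join(COMPLEMENT[base] for base in template_3_to_5.upper())


def split_codons(mrna):
    """Split mRNA into triplets from the first base, dropping any incomplete tail."""
    usable = len(mrna) - len(mrna) % 3
    return [mrna[i:i + 3] for i in range(0, usable, 3)]


mrna = transcribe("ATCGGCTAG")
print(mrna)                # UAGCCGAUC
print(split_codons(mrna))  # ['UAG', 'CCG', 'AUC']
print([c for c in split_codons(mrna) if c in STOP_CODONS])  # ['UAG']
```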
NEET-style problem · Lac operon
Question
Solution
Two conditions must both be met for maximum transcription:
1. Lactose present: allolactose (derived from lactose) binds and inactivates the repressor, freeing the operator for RNA polymerase.
2. Glucose absent: cAMP rises, cAMP-CRP binds the promoter, strongly stimulating transcription.
Answer: LACTOSE present + GLUCOSE absent = maximum lac operon transcription.
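The two conditions combine like a simple truth table. The sketch below is a simplified model of that logic (real expression levels are continuous, not three discrete states), and the function name is hypothetical.

```python
def lac_operon_state(glucose, lactose):
    """Simplified lac operon output for given sugar availability."""
    repressor_bound = not lactose  # no allolactose -> repressor stays on the operator
    crp_active = not glucose       # low glucose -> high cAMP -> cAMP-CRP can bind
    if repressor_bound:
        return "OFF"
    return "HIGH" if crp_active else "LOW"


for glucose in (True, False):
    for lactose in (False, True):
        print(f"glucose={glucose!s:5} lactose={lactose!s:5} ->",
              lac_operon_state(glucose, lactose))
```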
Cheat Sheet
KEY EXPERIMENTS
- Griffith (1928): heat-killed S + live R → mouse dies. "Transforming principle" exists.
- Avery, MacLeod, McCarty (1944): DNase destroyed transformation. DNA = transforming principle.
- Hershey-Chase (1952): 32P (DNA) enters bacteria; 35S (protein) stays outside. DNA = genetic material.
- Watson-Crick (1953): double helix model. A-T (2 H-bonds), G-C (3 H-bonds), antiparallel, right-handed.
- Meselson-Stahl (1958): density gradient shows semiconservative replication. After 1 gen: 1 hybrid band. After 2 gen: 50% hybrid + 50% light.
- Nirenberg and Matthaei (1961): poly-U → polyphenylalanine. UUU = Phe. First codon deciphered.
- Jacob and Monod (1961): lac operon model. Nobel 1965.
DNA NUMBERS
- Distance per base pair: 0.34 nm
- Pitch (one full turn): 3.4 nm
- Base pairs per turn: 10
- Diameter of helix: 2 nm
- DNA per nucleosome core: 146 bp
- Histones in octamer: 8
- Nucleosome composition: 2 × (H2A, H2B, H3, H4)
- Linker histone: H1
- Human genome size: 3.2 billion bp
- Human protein-coding genes: ~20,000-25,000
- Genome that codes for protein: ~2%
- SNPs in human genome: ~1.4 million
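These numbers combine into quick length estimates. The sketch below assumes ideal B-form DNA (0.34 nm rise per base pair, 10 bp per turn) and uses hypothetical helper names.

```python
RISE_PER_BP_NM = 0.34  # distance between consecutive base pairs
BP_PER_TURN = 10       # base pairs in one full helical turn


def helix_length_m(base_pairs):
    """Contour length in metres of a B-form helix with the given base pairs."""
    return base_pairs * RISE_PER_BP_NM * 1e-9


def helical_turns(base_pairs):
    """Number of full turns in the helix."""
    return base_pairs / BP_PER_TURN


haploid_genome_bp = 3.2e9
print(round(helix_length_m(haploid_genome_bp), 2))  # 1.09 (metres)
print(helical_turns(haploid_genome_bp))             # 320000000.0
```

So a single haploid set of human DNA, stretched out, would be a little over a metre long, which is why the packaging into nucleosomes matters.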
GENETIC CODE MUST-KNOWS
- Total codons: 4³ = 64 (61 sense + 3 stop)
- Start codon: AUG (Met) — only one
- Stop codons: UAA (Ochre), UAG (Amber), UGA (Opal) — three total
- First decoded: UUU = Phenylalanine (Nirenberg + Matthaei, 1961)
- Single-codon amino acids: Methionine (AUG) and Tryptophan (UGG)
- Properties: triplet, non-overlapping, commaless, degenerate, unambiguous, universal
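The codon counts above can be verified by enumerating every triplet. This is a small sketch using only the Python standard library; the variable names are arbitrary.

```python
from itertools import product

# Every ordered triplet of the four RNA bases
all_codons = ["".join(triplet) for triplet in product("UCAG", repeat=3)]
stop_codons = {"UAA", "UAG", "UGA"}
sense_codons = [c for c in all_codons if c not in stop_codons]

print(len(all_codons), len(sense_codons), len(stop_codons))  # 64 61 3
```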
LAC OPERON QUICK STATES
- Glucose + / Lactose − → Repressor bound → OFF
- Glucose − / Lactose − → Repressor bound, CRP active but blocked → OFF
- Glucose + / Lactose + → Repressor free, CRP inactive → LOW
- Glucose − / Lactose + → Repressor free, CRP active → HIGH (maximum)
Molecular Basis of Inheritance NEET quiz
12 NEET-style questions on DNA structure, replication, transcription, genetic code, lac operon, and more.
Which experiment conclusively proved that DNA is the genetic material in bacteriophages?
- Griffith's transformation experiment
- Avery, MacLeod, and McCarty experiment
- Hershey and Chase experiment using 35S and 32P
- Meselson and Stahl's density gradient experiment
Frequently asked questions
How often does Molecular Basis of Inheritance appear in NEET?
Molecular Basis of Inheritance is a Very High Weightage chapter with 5 to 8 questions in most NEET exams. Questions focus on Hershey-Chase experiment, Chargaff's rules, DNA structure (dimensions, base pairs, bonds), nucleosome, Meselson-Stahl experiment, replication enzymes, transcription (template strand, promoter), genetic code properties (degenerate, universal, start/stop codons), lac operon (inducible, structural genes, repressor), Human Genome Project (base pair count, gene count), and DNA fingerprinting (VNTRs). This is one of the highest-yield chapters in NEET Biology.
What is the difference between Griffith's, Avery's, and Hershey-Chase experiments?
Three landmark experiments established DNA as genetic material: (1) GRIFFITH'S EXPERIMENT (1928): Frederick Griffith showed that heat-killed S-strain (virulent) bacteria could transform live R-strain (non-virulent) bacteria into S-strain. He called the unknown agent the "transforming principle." He did not identify what the molecule was. (2) AVERY, MacLEOD, McCARTY EXPERIMENT (1944): They isolated the transforming principle and tested each class of molecule. DNase (enzyme that destroys DNA) abolished transformation; RNase and Protease did not. Conclusion: DNA is the transforming principle. (3) HERSHEY-CHASE EXPERIMENT (1952): Using bacteriophage T2, they labelled protein with 35S (sulphur, which is only in protein) and DNA with 32P (phosphorus, which is only in DNA). After infection, 32P was found INSIDE bacteria (DNA entered) while 35S stayed OUTSIDE (protein coat did not enter). This conclusively proved DNA is the genetic material. NEET trap: Griffith did not prove it was DNA; Avery proved it was DNA biochemically; Hershey-Chase confirmed it definitively.
What are Chargaff's rules and how do you use them in NEET calculations?
Chargaff's rules describe the base composition of double-stranded DNA: (1) A = T (adenine equals thymine in molar amounts). (2) G = C (guanine equals cytosine in molar amounts). (3) A + G = T + C (total purines = total pyrimidines). HOW TO USE: if you know the percentage of one base, you can find all others. Example: if A = 30%, then T = 30%, and G + C = 40%, so G = C = 20%. If G = 22%, then C = 22%, A + T = 56%, so A = T = 28%. These rules apply to double-stranded DNA only. In single-stranded DNA, in RNA, and within each individual strand of a double helix, A does NOT necessarily equal T. Also: the (A + T)/(G + C) ratio varies between species (useful for identifying organisms), but (A + G)/(T + C) = 1 always in ds-DNA. NEET frequently gives the % of one base and asks you to calculate the others.
What is the structure of DNA? Give all the key numbers for NEET.
The Watson-Crick double helix (B-form DNA) key facts for NEET: (1) TWO ANTIPARALLEL STRANDS: one runs 5'→3', the other 3'→5'. (2) BASE PAIRS: A pairs with T (2 hydrogen bonds), G pairs with C (3 hydrogen bonds). G-C pairs are stronger (GC-rich DNA melts at higher temperature). (3) BACKBONE: sugar-phosphate backbone on the outside; bases are on the inside. (4) DIMENSIONS: 0.34 nm (3.4 Å) between consecutive base pairs; 3.4 nm (34 Å) per complete turn; 10 base pairs per turn; 2 nm (20 Å) diameter. REMEMBER: 3.4 Å per bp, 34 Å per turn, 10 bp per turn. (5) GROOVES: major groove (wider, used by proteins for recognition) and minor groove (narrower). (6) RIGHT-HANDED helix (B-form). (7) Each nucleotide = phosphate + deoxyribose sugar + nitrogenous base. Purines: Adenine (A), Guanine (G) — double-ring. Pyrimidines: Thymine (T), Cytosine (C), Uracil (U in RNA) — single ring.
What is a nucleosome and how is DNA packaged into a chromosome?
DNA packaging occurs in several levels of organisation: (1) NUCLEOSOME: 146 bp of DNA wound 1.65 times around a histone OCTAMER (2 copies each of H2A, H2B, H3, H4). H1 is the LINKER histone that sits outside the nucleosome at the entry and exit of DNA. The 10 nm "beads-on-a-string" fibre is the basic level. (2) 30 nm SOLENOID: 6 nucleosomes per turn of the solenoid; this requires H1 histone. (3) 300 nm LOOPED DOMAIN structure: 30nm fibre attached to protein scaffold (non-histone proteins). (4) 700 nm chromatin: further compacted. (5) 1400 nm METAPHASE CHROMOSOME: the fully condensed form seen during cell division. Key NEET facts: Histones are positively charged (rich in lysine and arginine = basic amino acids); they interact with negatively charged DNA (phosphate groups). Nucleosome = the basic unit of chromatin. Non-histone chromosomal proteins form the scaffold.
What is semiconservative replication and how did Meselson-Stahl prove it?
Semiconservative replication means each daughter DNA double helix retains ONE original (parental) strand and ONE newly synthesised strand. The other proposed models were: conservative (both original strands stay together; both new strands together) and dispersive (original and new DNA are interspersed along both strands of every daughter molecule). MESELSON-STAHL EXPERIMENT (1958): (1) E. coli was grown on 15N medium (heavy nitrogen isotope) for many generations, making all DNA "heavy" (15N-15N). (2) The bacteria were then transferred to 14N medium (light nitrogen). (3) After ONE generation: all DNA was of INTERMEDIATE density (15N-14N hybrid), giving only ONE band in CsCl density gradient centrifugation. This RULED OUT conservative replication (which would give two bands: one heavy and one light). (4) After TWO generations: TWO bands appeared, 50% intermediate (15N-14N) and 50% light (14N-14N). This CONFIRMED semiconservative replication and ruled out the dispersive model, which predicts a single band at every generation that only drifts towards the light position, never a distinct light band.
What are the key enzymes in DNA replication?
DNA replication at the replication fork uses multiple enzymes working together: (1) HELICASE: unwinds the double helix by breaking hydrogen bonds; creates the replication fork. (2) SSB PROTEINS (Single-Strand Binding Proteins): stabilise the separated single strands; prevent re-annealing. (3) TOPOISOMERASE (gyrase): relieves the torsional stress (supercoiling) ahead of the replication fork. (4) PRIMASE: synthesises a short RNA primer (8-12 nucleotides) complementary to the template; necessary because DNA polymerase cannot start synthesis without a primer. (5) DNA POLYMERASE III: the main replication enzyme; adds new deoxyribonucleotides in the 5'→3' direction only; reads template 3'→5'. (6) DNA POLYMERASE I: removes the RNA primer and fills the gap with DNA. (7) DNA LIGASE: seals the nick (break) between adjacent DNA fragments (between Okazaki fragments on the lagging strand). LEADING strand: synthesised continuously (5'→3', towards the fork). LAGGING strand: synthesised discontinuously as OKAZAKI FRAGMENTS (short pieces, each starting with a primer). NEET trap: DNA polymerase can only add to an existing 3' -OH group; it cannot start a new strand. Only primase can initiate synthesis.
What are the key properties of the genetic code?
The genetic code translates mRNA codons into amino acids. Its properties for NEET: (1) TRIPLET: each codon is 3 nucleotides (64 possible codons from 4 bases: 4³ = 64). (2) NON-OVERLAPPING: each nucleotide belongs to only one codon; reading frame is fixed. (3) COMMALESS: no "punctuation" between codons; read continuously. (4) DEGENERATE (REDUNDANT): most amino acids are coded by MORE than one codon (61 sense codons for only 20 amino acids, plus 3 stop codons; wobble at the 3rd position). (5) UNIVERSAL: the same code is used by nearly all organisms (viruses, bacteria, plants, animals), an argument for common ancestry. Exceptions: mitochondria and some protists use slightly different codes. (6) UNAMBIGUOUS: one specific codon codes for only one amino acid (no ambiguity). (7) START CODON: AUG (codes for methionine in eukaryotes; N-formylmethionine in prokaryotes). (8) STOP CODONS (NONSENSE): UAA, UAG, UGA; they do not code for any amino acid and signal the end of translation. (9) First codon deciphered: UUU = phenylalanine (by Marshall Nirenberg and Heinrich Matthaei, 1961). Nirenberg shared the 1968 Nobel Prize with H. Gobind Khorana and Robert Holley for work on the genetic code.
How does the lac operon work?
The lac operon (Jacob and Monod, 1961) is the classic example of gene regulation in E. coli. It controls 3 structural genes needed to metabolise lactose. STRUCTURE: Promoter (P) + Operator (O) + Structural genes: lacZ (β-galactosidase: cleaves lactose to glucose + galactose), lacY (permease: transports lactose into the cell), lacA (transacetylase: function less clear). Separately, the regulatory gene (i) produces the LAC REPRESSOR protein continuously. HOW IT WORKS: (1) NO LACTOSE present: repressor binds the operator → blocks RNA polymerase → genes are OFF. (2) LACTOSE present: allolactose (a metabolite of lactose) acts as the INDUCER. It binds the repressor → repressor changes shape → cannot bind operator → RNA polymerase proceeds → genes are ON. (3) CATABOLITE REPRESSION: when GLUCOSE is present, cAMP levels are low → CRP (cAMP receptor protein, also called CAP, catabolite activator protein) cannot bind → low transcription even if lactose is present. When glucose is absent, cAMP rises → cAMP-CRP complex binds the promoter → strong transcription. NEET KEY: The lac operon is an INDUCIBLE operon (genes normally OFF; turned ON by inducer). The trp operon is a REPRESSIBLE operon (genes normally ON; turned OFF by co-repressor).