Documentation & User Guide
Overview
The Ribosome Decision Graph (RDG) is an interactive visualization tool for exploring alternative translation initiation in eukaryotic mRNAs.
It models how scanning ribosomes encounter multiple start codons and make probabilistic decisions to initiate or continue scanning,
ultimately producing diverse protein products (translons) from a single mRNA sequence.
What is a Ribosome Decision Graph?
The RDG is a directed graph that represents all possible translation paths through an mRNA:
- Nodes: Decision points at each start codon where ribosomes choose to initiate or scan past
- Edges: Translation events (translons) from start to stop, or scanning events moving to the next decision point
- Probabilities: Each path has an associated probability based on start codon context (Kozak sequence, codon type, local structure)
- Outcomes: Final protein products with predicted relative abundances
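The graph structure described above can be sketched in Python. This is only an illustration, not the tool's internal representation: the `Edge` class, the `build_rdg` function, and the node names are hypothetical, and the sketch assumes a linear mRNA with no reinitiation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    source: str         # decision node the ribosome leaves
    target: str         # next decision node, or a terminal outcome
    kind: str           # "translate" (a translon) or "scan"
    probability: float  # chance of taking this edge from `source`


def build_rdg(starts):
    """Build edges from (translon_name, initiation_probability) pairs in 5'->3' order."""
    edges = []
    for i, (name, p_init) in enumerate(starts):
        node = f"start{i + 1}"
        # After the last start codon, scanning ribosomes never initiate.
        next_node = f"start{i + 2}" if i + 1 < len(starts) else "no_initiation"
        edges.append(Edge(node, f"product:{name}", "translate", p_init))
        edges.append(Edge(node, next_node, "scan", 1.0 - p_init))
    return edges


for edge in build_rdg([("T1", 0.45), ("T2", 0.60)]):
    print(edge)
```

Each decision node has exactly two outgoing edges whose probabilities sum to 1, which is what makes the graph a sequence of initiate-or-scan choices.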
Getting Started
Loading Sequences
You can load sequences in three ways:
- Manual Input: Paste any mRNA sequence (AUGC format) directly into the text area
- Gene Symbol: Enter a human gene symbol (e.g., ATF4, BRCA1, TP53) to fetch the canonical transcript from Ensembl
- Transcript ID: Enter a specific Ensembl transcript ID (e.g., ENST00000337304) for precise control
Note: Fetched sequences are trimmed to the first 1500 nucleotides for optimal performance. This typically includes the full 5'UTR and initial coding region where most regulatory elements reside.
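If you prepare sequences outside the tool, a cleanup step along these lines may help. It is a minimal sketch: the function name, the handling of FASTA header lines, and the conversion of T to U are assumptions, and only the 1500-nucleotide limit comes from the note above.

```python
import re

MAX_LENGTH = 1500  # trimming limit stated in the note above


def normalize_sequence(raw, max_length=MAX_LENGTH):
    """Drop FASTA headers and whitespace, convert DNA to RNA letters, and trim."""
    lines = [line for line in raw.splitlines() if not line.startswith(">")]
    seq = re.sub(r"\s+", "", "".join(lines)).upper().replace("T", "U")
    invalid = set(seq) - set("AUGC")
    if invalid:
        raise ValueError(f"unexpected characters: {sorted(invalid)}")
    return seq[:max_length]


print(normalize_sequence(">example\natg cat\n"))
```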
Understanding the Visualization
The RDG has two main view modes:
- Tree View: Shows the complete decision tree with branching paths at each start codon
  - Frame tracks (green, blue, orange) represent the three reading frames
  - Circles indicate start codon positions (size = Kozak strength)
  - Bars show translation events from start to stop
  - Branch points show where ribosomes decide: initiate (go down) or scan past (continue right)
  - Edge labels display probabilities and translon names (e.g., "T1 (45.2%)")
- Flux View: Shows ribosome traffic through the sequence over time
  - Animated ribosomes (colored circles) scan 5' to 3'
  - Ribosome color indicates which frame they're aligned to
  - Watch how ribosomes accumulate at strong start codons and how some scan past weak ones
  - Use simulation speed controls (1x, 2x, 5x, 10x) to adjust animation
Working with Translons
Adding Translation Events
Translons are added automatically based on detected start codons, but you can also add them manually:
- Click anywhere on the canvas to add a translon starting at that position
- The tool finds the nearest downstream stop codon in the same frame
- Initial probability is set based on Kozak sequence context
- Fine-tune probability using the slider in the translon controls panel
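The stop codon search described above can be sketched as follows. The function name and the choice to return `None` when no in-frame stop exists are assumptions made for illustration; the tool may handle that case differently.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_stop_codon(seq, start):
    """Return the index of the first in-frame stop codon after `start`, or None."""
    # Step in whole codons so only the reading frame of `start` is examined.
    for i in range(start + 3, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


seq = "GCCAUGGCUUAAGG"
print(find_stop_codon(seq, 3))  # start codon AUG begins at index 3
```

Stops in the other two frames are skipped, which is why an out-of-frame UAA does not end the translon.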
Translon Details
Click on any translon card to expand and view detailed information:
- Position: Start and stop coordinates in the mRNA
- Frame: Reading frame (0, 1, or 2)
- ORF Length: Size of the open reading frame
- Start Codon: Actual codon sequence (AUG or near-cognate)
- Kozak Context: Flanking sequence and quality score
  - Strong (≥0.7): Efficient initiation expected
  - Moderate (0.4-0.7): Context-dependent initiation
  - Weak (<0.4): Leaky scanning likely
- GC Content: Local RNA structure indicator (high GC = potential structure)
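The Kozak categories and the GC content value can be reproduced with a few lines of Python. This is a sketch with hypothetical function names; it assumes a score of exactly 0.7 counts as Strong and exactly 0.4 as Moderate, following the thresholds listed above.

```python
def classify_kozak(score):
    """Map a Kozak quality score (0.0-1.0) to the labels shown on the translon card."""
    if score >= 0.7:
        return "Strong"
    if score >= 0.4:
        return "Moderate"
    return "Weak"


def gc_fraction(region):
    """Fraction of G and C bases in a stretch of RNA (0.0 for an empty string)."""
    if not region:
        return 0.0
    return sum(base in "GC" for base in region) / len(region)


print(classify_kozak(0.5), gc_fraction("GGCCAAUU"))
```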
Adjusting Probabilities
Each translon has an adjustable initiation probability (0-99%):
- Move the slider to change probability
- The RDG updates in real-time to reflect your changes
- Probabilities affect downstream translons - if 80% initiate at T1, only 20% are available for T2
- Use this to test different biological scenarios and parameter sensitivities
Understanding the RDG Model
Translation Probability Calculation
The model calculates initiation probability at each start codon:
P_init = P_base × f_Kozak × f_codon × f_structure
- P_base (Base Probability): Fundamental initiation rate (default 0.3, adjustable in RDG Parameters)
- f_Kozak (Kozak Factor): Context strength, 0.5-1.0 based on positions -3 and +4
  - Optimal context: gcc(A/G)ccAUGG
  - Position -3 = A or G: +0.5
  - Position +4 = G: +0.5
- f_codon (Codon Factor): Codon type efficiency
  - AUG: 1.0 (100% efficiency)
  - CUG, GUG, ACG, etc.: 0.1 (10% efficiency, adjustable via "Near-Cognate Penalty")
- f_structure (Structure Factor): Local secondary structure effect
  - Based on GC content in 20nt window
  - High GC (>50%) increases probability (structure may pause ribosomes)
  - Effect controlled by "GC Bonus" parameter
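The product formula above can be sketched in Python. Several details are not specified in this guide and are assumptions here: how the two context bonuses map onto the 0.5-1.0 factor range (`0.5 + 0.5 * score`), the placement of the 20-nt window directly downstream of the start codon, the linear form of the GC term, the `gc_bonus` default, and the cap at 0.99 (taken from the slider range). Function names are hypothetical.

```python
def kozak_score(seq, start):
    """Score 0.0-1.0 from positions -3 and +4, counting the start codon's first base as +1."""
    score = 0.0
    if start >= 3 and seq[start - 3] in "AG":  # position -3 is a purine
        score += 0.5
    if start + 3 < len(seq) and seq[start + 3] == "G":  # position +4 is G
        score += 0.5
    return score


def initiation_probability(seq, start, p_base=0.3, near_cognate=0.1, gc_bonus=0.5):
    """P_init = P_base * f_Kozak * f_codon * f_structure, under the assumptions above."""
    f_kozak = 0.5 + 0.5 * kozak_score(seq, start)  # assumed mapping to 0.5-1.0
    f_codon = 1.0 if seq[start:start + 3] == "AUG" else near_cognate
    window = seq[start + 3:start + 23]  # assumed 20-nt downstream window
    gc = sum(b in "GC" for b in window) / len(window) if window else 0.0
    f_structure = 1.0 + gc_bonus * max(0.0, gc - 0.5)  # bonus only above 50% GC
    return min(0.99, p_base * f_kozak * f_codon * f_structure)


seq = "GCCACCAUGG" + "AU" * 10
print(initiation_probability(seq, 6))
```

Swapping the AUG for CUG in the same context scales the result by the near-cognate penalty, which is a quick way to see how much each factor contributes.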
Adjustable Parameters
Fine-tune the model using the RDG Parameters section:
- Base P (0.1-0.5): Baseline initiation probability - increase for generally higher initiation rates
- Near-Cognate Penalty (0.05-0.3): Non-AUG start efficiency - higher values make CUG/GUG more competitive
- GC Bonus (0.0-1.0): Structure-mediated enhancement - higher values make GC-rich regions favor initiation more
Click "Apply to All Translons" after adjusting parameters to recalculate all probabilities with your new settings.
Reinitiation Model
After translating a short upstream ORF (uORF), ribosomes may reinitiate downstream:
- Reinitiation is more likely after short uORFs (<50 codons)
- Probability decreases with uORF length (ribosomes lose factors during long translation)
- The model shows reinitiation paths as branches continuing from stop codons
- This explains how genes like human ATF4 and yeast GCN4 achieve stress-dependent upregulation
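This guide does not state the exact function relating uORF length to reinitiation. The sketch below uses an exponential decay purely as a stand-in with the right qualitative shape; `p_max`, `length_scale`, and the function name are all hypothetical and do not reflect the tool's actual parameters.

```python
import math


def reinitiation_probability(uorf_codons, p_max=0.5, length_scale=50.0):
    """Hypothetical decay curve: reinitiation falls off as the uORF gets longer."""
    if uorf_codons < 0:
        raise ValueError("uORF length cannot be negative")
    return p_max * math.exp(-uorf_codons / length_scale)


for codons in (3, 30, 120):
    print(codons, round(reinitiation_probability(codons), 3))
```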
Predicted Protein Products
The Products panel shows all predicted proteins with:
- Name: Translon identifier (T1, T2, etc.)
- Length: Protein length in amino acids (ORF nt ÷ 3)
- MW: Molecular weight in kDa (assuming 110 Da average per residue)
- Abundance: Relative amount (%) based on initiation probability
  - Accounts for sequential depletion (if 80% initiate at T1, max 20% available for T2)
  - Normalized so total across all products = 100%
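The Length and MW columns can be reproduced as follows. The function names are hypothetical, and the sketch assumes that ORF length is counted from the first base of the start codon up to, but not including, the stop codon; the 110 Da per residue figure comes from the list above.

```python
def protein_length(start, stop):
    """Residues encoded between the start codon (inclusive) and stop codon (exclusive)."""
    return (stop - start) // 3


def molecular_weight_kda(n_residues, da_per_residue=110):
    """Approximate molecular weight in kDa from an average residue mass."""
    return n_residues * da_per_residue / 1000


residues = protein_length(start=120, stop=1020)
print(residues, molecular_weight_kda(residues))
```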
Interactive Features
Simulation Controls
- Play/Pause: Start/stop the ribosome scanning simulation (flux view)
- Clear: Remove all ribosomes from the simulation
- Speed Buttons: Adjust animation speed (1x, 2x, 5x, 10x)
Naming & Export
- Name This Analysis: Save your current RDG with a descriptive name for reference
- Export as Video: Record the canvas as a WebM video file (works best in flux view to capture animation)
Practical Applications
Studying Gene Regulation
- uORF-Mediated Regulation: Analyze genes like ATF4, ATF5, or GCN4 (a yeast gene, so paste its sequence manually) to understand how upstream ORFs control downstream translation
- Leaky Scanning: Compare strong vs. weak Kozak contexts to predict bypass efficiency
- Alternative Isoforms: Identify potential N-terminal variants from downstream start codons
- Near-Cognate Initiation: Detect non-AUG start sites that may produce functional proteins
Optimizing Constructs
- Reporter Design: Predict how 5'UTR sequences affect downstream reporter expression
- Expression Tuning: Test different Kozak sequences to optimize protein production
- Background Reduction: Identify and eliminate unintended start codons
Best Practices & Tips
- Start Simple: Load a well-characterized gene (e.g., ATF4) to understand the interface before analyzing your own sequences
- Focus on 5'UTR: Most alternative initiation happens in the first few hundred nucleotides - trimming sequences to ~1500 nt is usually sufficient
- Check Kozak Scores: Strong contexts (≥0.7) are efficient initiation sites; weak contexts (<0.4) often permit leaky scanning
- Use Both Views: Tree view for understanding decision logic, flux view for visualizing ribosome dynamics
- Test Parameters: Adjust base probability and penalties to see how sensitive your predictions are to model assumptions
- Compare Variants: Load wild-type sequence, adjust one parameter or remove one uORF, and compare products
- Name Your Work: Use descriptive names (e.g., "ATF4 WT", "ATF4 uORF1-mut") when saving analyses
Limitations & Considerations
- Simplified Model: Real translation involves many factors not captured here including:
  - RNA secondary structure (only approximated via GC content)
  - RNA-binding proteins and upstream ORF peptide effects
  - Ribosome availability and initiation factor concentrations
  - Context-specific regulation (cell type, stress conditions)
- No Stop Readthrough: Model assumes all stop codons terminate translation (readthrough/suppression not implemented)
- No Frameshifting: Programmed ribosomal frameshifts are not currently supported
- Static Probabilities: Initiation probabilities are based on sequence alone, not dynamic cellular conditions
- Prediction Tool: Results are hypothetical - experimental validation (e.g., ribosome profiling, western blots) is essential
Example Workflows
Workflow 1: Analyzing ATF4 uORF Regulation
- Select "Gene Symbol" and enter "ATF4"
- Click "Fetch Sequence" - loads canonical transcript 5'UTR
- Switch to Tree View to see the complete RDG
- Observe two main uORFs (uORF1 and uORF2) before the main ORF
- Note that uORF1 is short - ribosomes likely reinitiate after translating it
- uORF2 overlaps with the main ORF start - creates competing initiation
- Under normal conditions (adjust base P to ~0.3), uORF2 captures most ribosomes, limiting main ORF translation
- Simulate stress (increase base P to ~0.4-0.5) - more ribosomes initiate at uORF1, fewer reach uORF2, more reinitiate at main ORF
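- This approximates ATF4's paradoxical stress-induced upregulation; in cells, reduced eIF2 ternary complex availability delays reinitiation after uORF1, so more ribosomes scan past uORF2 and reinitiate at the main ORF (Vattem & Wek, 2004)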
Workflow 2: Optimizing a Reporter Construct
- Paste your 5'UTR + reporter ORF sequence
- Review all detected start codons in the sequence
- Identify any upstream start codons that might compete with your reporter
- Check their Kozak scores - strong contexts (≥0.7) may significantly reduce reporter expression
- Consider mutating problematic uORFs (change AUG to AAG) or adjusting Kozak context
- Use the abundance predictions to estimate reporter yield
- For dual luciferase reporters, see the Western Blot demo for protein product predictions
Technical Details
Reading Frame Color Coding
- Frame 0 (Green): Positions 0, 3, 6, 9... (divisible by 3)
- Frame 1 (Blue): Positions 1, 4, 7, 10... (remainder 1 when divided by 3)
- Frame 2 (Orange): Positions 2, 5, 8, 11... (remainder 2 when divided by 3)
Start Codon Detection
The tool automatically detects:
- AUG: Standard methionine start codon
- Near-cognates: CUG (Leu), GUG (Val), ACG (Thr), AUU (Ile), AUC (Ile), AUA (Ile) - all shown but with reduced efficiency
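A scan for the codons listed above, tagged with the frame numbering from the previous section, might look like this. The function name and the tuple layout are hypothetical; the codon list is taken directly from this section.

```python
NEAR_COGNATES = {"CUG", "GUG", "ACG", "AUU", "AUC", "AUA"}
START_CODONS = {"AUG"} | NEAR_COGNATES


def find_start_codons(seq):
    """Return (position, codon, frame) for every AUG or near-cognate codon in `seq`."""
    hits = []
    for i in range(len(seq) - 2):
        codon = seq[i:i + 3]
        if codon in START_CODONS:
            hits.append((i, codon, i % 3))  # frame = position modulo 3
    return hits


for hit in find_start_codons("GGAUGCUGA"):
    print(hit)
```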
Abundance Calculation
Product abundance accounts for sequential ribosome depletion:
Available ribosomes = 100%
T1 initiates at 45% → produces 45% T1, leaves 55% available
T2 initiates at 60% of remaining → produces 33% T2 (0.55 × 0.60), leaves 22% available
T3 initiates at 80% of remaining → produces 17.6% T3 (0.22 × 0.80), leaves 4.4% available
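The worked example above can be checked with a short Python sketch. The function names are hypothetical, and the normalization step is one plausible reading of "total across all products = 100%" rather than a confirmed detail of the tool.

```python
def product_abundances(probabilities):
    """Apply sequential depletion; return per-translon fractions and the leftover pool."""
    remaining = 1.0
    produced = []
    for p in probabilities:
        produced.append(remaining * p)  # share of the pool that initiates here
        remaining *= 1.0 - p            # share that scans past
    return produced, remaining


def normalize_to_percent(produced):
    """Rescale product fractions so they sum to 100%."""
    total = sum(produced)
    return [100.0 * x / total for x in produced] if total else [0.0] * len(produced)


produced, leftover = product_abundances([0.45, 0.60, 0.80])
print([round(100 * x, 1) for x in produced], round(100 * leftover, 1))
print([round(x, 1) for x in normalize_to_percent(produced)])
```

The first line of output reproduces the 45%, 33%, and 17.6% figures and the 4.4% leftover pool; the second shows the same products rescaled to sum to 100%.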
Shared Code Architecture
This tool uses modular JavaScript libraries:
- rdg-engine.js: Core calculation engine
  - Start codon detection
  - Kozak scoring
  - Probability calculations
  - Stop codon finding
  - Protein product prediction
- rdg-viz.js: Visualization layer
  - Tree layout algorithm
  - Canvas rendering
  - Construct suggestions
These modules are shared with the Western Blot demo for consistent behavior and easier maintenance.
Related Tools
- Western Blot Demo: Use the same RDG model to predict protein products from reporter constructs and visualize them as gel bands
- Ribosome Profiling: Compare RDG predictions with experimental ribosome profiling data to validate and refine the model
Further Reading
- Kozak M. (1987) An analysis of 5'-noncoding sequences from 699 vertebrate messenger RNAs. Nucleic Acids Res
- Kozak M. (2002) Pushing the limits of the scanning mechanism for initiation of translation. Gene
- Hinnebusch AG. (2011) Molecular mechanism of scanning and start codon selection in eukaryotes. Microbiol Mol Biol Rev
- Ingolia NT et al. (2009) Genome-wide analysis in vivo of translation with nucleotide resolution using ribosome profiling. Science
- Vattem KM & Wek RC. (2004) Reinitiation involving upstream ORFs regulates ATF4 mRNA translation in mammalian cells. PNAS
- Young SK & Wek RC. (2016) Upstream open reading frames differentially regulate gene-specific translation in the integrated stress response. J Biol Chem
Glossary
- Translon: A translation event producing a specific protein isoform from a particular start-stop codon pair
- uORF (upstream ORF): Short open reading frame in the 5'UTR that can regulate downstream translation
- Kozak Sequence: Nucleotide context surrounding a start codon that affects initiation efficiency
- Leaky Scanning: Process where ribosomes bypass a start codon and continue scanning downstream
- Reinitiation: Resumption of scanning and translation after terminating a short uORF
- Near-Cognate Codon: Non-AUG codon (e.g., CUG, GUG) that can initiate translation at reduced efficiency
- Reading Frame: One of three possible ways to read a sequence as triplet codons (frames 0, 1, 2)
- ORF (Open Reading Frame): Sequence from a start codon to a stop codon without internal stops