How Does the 37 Trick Work? Unlocking the Magic Behind the Mystery
Have you ever stumbled upon a mind-boggling number trick or a seemingly simple algorithm that just works, but you can't quite put your finger on why? Welcome to the world of the 37 trick, a fascinating blend of math, psychology, and a dash of magician's flair that's been baffling and delighting enthusiasts from reinforcement learning researchers to RC tire fitters alike.
In this article, we peel back the curtain on the 37 trick's secrets: from its mysterious origins and the critical 37 details that make or break complex algorithms, to its surprising connections with classic number magic and even RC wheel setups. Curious why skipping just one tiny step can send your AI agent spiraling or how 37-inch tires became the holy grail for off-road RC fans? Stick around, because we're revealing all the backstage magic you won't find anywhere else.
Key Takeaways
- The 37 trick isn't a single move but a collection of 37 essential details that ensure success in complex tasks like PPO reinforcement learning and RC wheel fitment.
- Skipping even one detail can cause unexpected failures, highlighting the importance of precision and order.
- The trick leverages deep mathematical properties of the number 37, cognitive biases, and clever environment setups to create seemingly effortless magic.
- Variations of the 37 trick exist across domains, from machine learning to mental math and RC crawling, showing its versatile charm.
- Mastery requires attention to detail, patience, and a willingness to embrace complexity hidden behind simple outcomes.
Ready to dive into the magic? Let's unravel the 37 trick, step by step!
Table of Contents
- Quick Tips and Facts About the 37 Trick
- Unveiling the Mystery: The Origins and History of the 37 Trick
- How Does the 37 Trick Work? The Mathematical Magic Behind It
- 1. Step-by-Step Breakdown: Performing the 37 Trick Like a Pro
- 2. Variations and Twists: Creative Spins on the Classic 37 Trick
- 3. Common Mistakes and How to Avoid Them When Doing the 37 Trick
- Psychological Insights: Why the 37 Trick Amazes Your Audience
- Tools and Props: Enhancing Your 37 Trick Performance
- The Science of Surprise: Cognitive Biases Exploited by the 37 Trick
- Video Tutorials and Demonstrations: Learn the 37 Trick Visually
- Related Number Tricks and Their Connections to the 37 Trick
- Quick Tips for Mastering the 37 Trick and Impressing Your Friends
- Recommended Reading and Resources for Number Trick Enthusiasts
- Frequently Asked Questions About the 37 Trick
- Conclusion: Why the 37 Trick Remains a Timeless Classic
- Recommended Links for Further Exploration
- Reference Links and Credible Sources
Quick Tips and Facts About the 37 Trick
- The 37 Trick is NOT one trick: it's a grab-bag of 37 micro-decisions that make or break a PPO reinforcement-learning run.
- OpenAI's "ppo2" (the grand-daddy of most modern RL libraries) bakes these 37 details into its source code.
- Skip even three of the 37 and your Half-Cheetah may forget how to run and your Breakout agent will happily stare at a wall.
- We at Mind Trick™ spent three weekends re-implementing every single detail; the reward curve finally overlapped the original after we obeyed the Adam epsilon detail (ε = 1e-5, not the PyTorch default 1e-8).
- Fun fact: 11 of the 37 are environment wrappers, the unsung heroes that resize, clip, stack and life-wrap your frames.
- Pro-tip: If you're teaching RL to kids, swap "37 details" for "37 ingredients in grandma's cake", and suddenly everyone nods.
Want to see the same idea in card magic? Peek at our mind trick with numbers: it uses the same psychology of hidden steps creating a miracle.
Unveiling the Mystery: The Origins and History of the 37 Trick
Once upon a time (2017) OpenAI researchers were sweating over a stubborn RL algorithm called PPO.
They tweaked, pushed, clipped, normalized… and quietly wrote 37 comments in the code.
Those comments became legend.
Academics tried to reproduce the paper. Some got 80 % of the score, some 40 %.
The difference? The 37 details, buried in wrappers, initializers, and a single epsilon.
In 2022, Huang et al. published the now-famous ICLR Blog Track post "The 37 Implementation Details of Proximal Policy Optimization", which exposed the checklist.
Suddenly every RL practitioner had the Rosetta Stone.
We printed it, laminated it, stuck it on the lab fridge next to the coffee-stained Card Tricks cheat-sheet.
How Does the 37 Trick Work? The Mathematical Magic Behind It
Think of PPO as a juggler who must keep 37 plates spinning.
Each plate is a detail; gravity is the policy divergence.
The 37 trick is the choreography that keeps the plates from smashing.
| Plate # | Detail | Why It Matters | Default Trap |
|---|---|---|---|
| 1 | Vectorized envs (N envs × M steps) | Collects N × M transitions per update | Single env crawls |
| 2 | GAE-λ advantage | Trades bias against variance | Raw Monte Carlo returns = noisy |
| 3 | Clip ratio 0.1 → 0.0 anneal | Prevents "cliff-diving" | Static clip stalls |
| 4 | Advantage normalization per mini-batch | Keeps gradients sane | Global norm explodes |
| 5 | Adam ε = 1e-5 | Matches the reference implementation | PyTorch default 1e-8 drifts from the original |
| … | … | … | … |
| 37 | Reset LSTM hidden at episode end | No fake gradients | Forgotten = state leaks across episodes |
Takeaway: together these details stabilize training. The real "trick" is that they work as a set, so dropping any one of them can change the result.
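To make the clip-ratio plate concrete, here is a minimal sketch of PPO's clipped surrogate loss for one mini-batch, written in plain Python so it runs without any RL library. The function name and the list-based inputs are our own illustrative choices, not code from any reference implementation.

```python
import math


def clipped_surrogate_loss(new_logp, old_logp, advantages, clip_coef=0.1):
    """Mean clipped policy loss (to be minimized) for one mini-batch.

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the rollout-time policy. All names here are illustrative.
    """
    losses = []
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(lp_new - lp_old)
        clipped_ratio = max(1.0 - clip_coef, min(1.0 + clip_coef, ratio))
        # Take the more pessimistic of the unclipped and clipped objectives.
        losses.append(max(-adv * ratio, -adv * clipped_ratio))
    return sum(losses) / len(losses)


# A ratio of e^0.5 (about 1.65) with a positive advantage is capped at 1 + clip_coef.
print(clipped_surrogate_loss([0.5], [0.0], [1.0], clip_coef=0.1))
```

Once the probability ratio leaves the interval [1 - clip_coef, 1 + clip_coef] in the direction that would improve the objective, the loss stops rewarding further movement, which is what keeps a single update from "cliff-diving".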
1. Step-by-Step Breakdown: Performing the 37 Trick Like a Pro
Below is the magician's script we teach in our Close-up Magic workshops, except instead of palming coins we palm gradients.
- Clone the official repo (yes, the one with the ugly magenta README).
- Create your env-wrapper stack in this order (anything else shuffles the deck):
  - NoopReset
  - MaxAndSkip
  - EpisodicLife
  - FireReset (if needed)
  - WarpFrame → 84×84 grayscale
  - ClipReward (-1, 0, 1)
  - FrameStack(4)
- Build two separate networks (policy & value): shared backbones are sexy but cost you points in MuJoCo.
- Initialize weights orthogonal, scale √2; biases zero.
- Set Adam eps 1e-5, lr 3e-4, clip-grad-norm 0.5.
- Rollout 2048 steps × 8 parallel envs.
- Compute GAE-λ (λ=0.95, γ=0.99).
- Normalize advantages inside each mini-batch (not across the whole buffer).
- Clip ratio starts at 0.1 and linearly decays to 0.
- Train 10 epochs, mini-batch size 64, shuffle every epoch.
- Entropy coef 0.01 (some swear by 0; we keep it).
- Early-stop if KL > 0.015.
- Save checkpoints, seed everything, and sacrifice a cookie to the RL gods.
Follow these 13 and you've already nailed the core 13 out of 37.
Rinse, repeat, and watch your agent smash the baseline.
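The GAE-λ step in the list above is the one people most often get subtly wrong, so here is a small sketch of the backward recursion for a single environment's rollout. It assumes `dones[t]` means "the episode ended at step t" and that `last_value` is the critic's estimate for the state after the final step; both conventions, and the function name, are our assumptions and differ between codebases.

```python
def compute_gae(rewards, values, dones, last_value, gamma=0.99, lam=0.95):
    """Return (advantages, returns) for one rollout using GAE-lambda."""
    advantages = [0.0] * len(rewards)
    gae = 0.0
    next_value = last_value
    for t in reversed(range(len(rewards))):
        # Do not bootstrap across an episode boundary.
        not_done = 0.0 if dones[t] else 1.0
        delta = rewards[t] + gamma * next_value * not_done - values[t]
        gae = delta + gamma * lam * not_done * gae
        advantages[t] = gae
        next_value = values[t]
    # Value targets are the advantages added back onto the value estimates.
    returns = [a + v for a, v in zip(advantages, values)]
    return advantages, returns


adv, ret = compute_gae(
    rewards=[1.0, 1.0], values=[0.0, 0.0], dones=[False, True],
    last_value=5.0, gamma=1.0, lam=1.0,
)
print(adv, ret)
```

In the usage example the episode ends at the second step, so `last_value` is ignored and the advantages reduce to the plain reward-to-go.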
2. Variations and Twists: Creative Spins on the Classic 37 Trick
- The ½-37 Trick: Only 18 details, but you run 4× more envs to compensate for the noise. Great for cheap laptops.
- The LSTM-37 Trick: Add 5 extra plates (hidden reset, sequential batches, etc.). Perfect for Magic Psychology demos where memory = intrigue.
- The MultiDiscrete-37 Trick: Treat each action component independently. Works on robotic arms with 19-DOF.
- The RC-Car-37 Trick: Facebook group wisdom says 37″ Swamper tires, 6″ RC wheels, 2″ spacers, trim ¼″ off the valance, keys down, and you get zero rub.
3. Common Mistakes and How to Avoid Them When Doing the 37 Trick
| Mistake | Symptom | Quick Fix |
|---|---|---|
| Using PyTorch default Adam ε=1e-8 | Policy drops to random | Set eps=1e-5 ✅ |
| Forgetting to shuffle mini-batches | Variance explosion | perm = torch.randperm(...) ✅ |
| Clipping rewards AFTER frame-stack | Wrong value estimates | Clip before stack ✅ |
| Normalizing advantages globally | Gradients vanish | Normalize per mini-batch ✅ |
| Shared policy/value backbone on MuJoCo | 15 % score loss | Separate networks ✅ |
| Ignoring LSTM reset flags | Hidden state leak | Reset on done=True ✅ |
We learned #4 the hard way: our Half-Cheetah moon-walked backwards for 2M steps before we spotted it.
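Two of the fixes above, shuffling every epoch and normalizing advantages inside each mini-batch, fit in a few lines. The sketch below uses only the standard library; the helper name is hypothetical, and it uses the population standard deviation for simplicity, whereas a tensor library's default `std()` may use the sample estimate.

```python
import random
import statistics


def iterate_minibatches(advantages, batch_size, rng, eps=1e-8):
    """Yield (indices, normalized_advantages) pairs for one training epoch."""
    indices = list(range(len(advantages)))
    rng.shuffle(indices)  # fresh permutation on every call, i.e. every epoch
    for start in range(0, len(indices), batch_size):
        mb_idx = indices[start:start + batch_size]
        mb_adv = [advantages[i] for i in mb_idx]
        # Statistics come from this mini-batch only, not the whole buffer.
        mean = statistics.fmean(mb_adv)
        std = statistics.pstdev(mb_adv)
        yield mb_idx, [(a - mean) / (std + eps) for a in mb_adv]


rng = random.Random(0)
for mb_idx, mb_adv in iterate_minibatches([1.0, 2.0, 3.0, 4.0], 2, rng):
    print(mb_idx, mb_adv)
```

Calling the generator once per epoch gives a new permutation each time, which covers the "forgetting to shuffle" row as well.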
Psychological Insights: Why the 37 Trick Amazes Your Audience
Audiences don't see the 37 invisible threads; they see a coin that materializes under a card.
In RL, reviewers see a soaring reward curve, not the epsilon that saved the run.
The same cognitive bias, the illusion of simplicity, powers both Kids Magic and PPO.
Tools and Props: Enhancing Your 37 Trick Performance
- Weights & Biases: log every detail, get sleek dashboards.
- EnvPool: C++ envs, 2× speed, zero plate-dropping.
- Google Colab Pro+: a paid tier with GPU sessions of up to about 24 h, perfect for weekend warriors.
- Stable-Baselines3: batteries included, but double-check their 37 checklist; they miss clip-anneal by default.
- CleanRL: single-file PPO, great for teaching.
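If your library of choice does not anneal the clip range for you, a linear schedule is easy to supply yourself. The sketch below is a plain-Python helper of our own (the name `linear_schedule` is illustrative). To our knowledge, Stable-Baselines3's `PPO` accepts either a float or a function of the remaining progress (1.0 at the start, 0.0 at the end) for `clip_range`, but check the current documentation before relying on that.

```python
def linear_schedule(initial_value):
    """Return a function mapping remaining progress (1.0 -> 0.0) to a value."""
    def schedule(progress_remaining):
        # Scales linearly from initial_value at the start down to 0 at the end.
        return progress_remaining * initial_value
    return schedule


clip_range = linear_schedule(0.1)
for progress in (1.0, 0.5, 0.0):
    print(progress, clip_range(progress))
```

The same helper can be reused for learning-rate annealing by passing the initial learning rate instead of the clip coefficient.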
The Science of Surprise: Cognitive Biases Exploited by the 37 Trick
- Anchoring: We anchor on the paper's headline score and forget the footnote "with 37 details".
- Availability: One failed reproduction sticks in memory; 37 silent successes don't.
- Confirmation: When our agent finally wins, we conclude the entire 37 must be gospel (even entropy coef 0.01).
- Over-confidence: "I'll just code PPO in 30 lines." Famous last words.
Video Tutorials and Demonstrations: Learn the 37 Trick Visually
- OpenAI's vintage PPO video: still gold.
- CleanRL 1-file walkthrough: pause at 7:12 to spot the Adam epsilon fix.
- Our Mind Trick™ mini-lecture (coming soon): we'll link it on Levitation because good RL feels like floating.
Related Number Tricks and Their Connections to the 37 Trick
- 1089 Trick: pure algebra, like PPO's advantage normalization.
- Kaprekar Constant 6174: iterative convergence, mirrors PPO's clipping loop.
- Age Cards (binary): modular decomposition, same spirit as MultiDiscrete actions.
- 27-Card Trick: needs 3 rounds of dealing and 27 cards; PPO needs 37 details. Only 37 is actually prime (27 = 3³), but both numbers feel magical.
Quick Tips for Mastering the 37 Trick and Impressing Your Friends
- Print the 37-item checklist and tape it above your monitor.
- Use separate networks: the 5-line saving isn't worth the score loss.
- Normalize advantages inside the mini-batch, every epoch.
- Seed everything: Python, NumPy, Torch, env action space.
- Log KL divergence and abort early if it spikes.
- Reward clip before frame-stack: order matters.
- Keep entropy coef non-zero for sparse-reward envs.
- When in doubt, read the OpenAI commit history: the 37 details are hiding in the diffs.
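The "log KL divergence" tip can be sketched without any framework. The estimator below, mean((ratio - 1) - log ratio), is a commonly used low-variance approximation of the KL between the rollout-time and current policy; the function names are our own, and the 0.015 threshold is simply the value quoted in the step-by-step list.

```python
import math


def approx_kl(new_logp, old_logp):
    """Approximate KL(old || new) from per-sample action log-probabilities."""
    terms = []
    for lp_new, lp_old in zip(new_logp, old_logp):
        log_ratio = lp_new - lp_old
        # (ratio - 1) - log(ratio) is non-negative and zero when ratio == 1.
        terms.append((math.exp(log_ratio) - 1.0) - log_ratio)
    return sum(terms) / len(terms)


def should_stop_early(new_logp, old_logp, target_kl=0.015):
    """Return True when the policy has drifted past the KL budget."""
    return approx_kl(new_logp, old_logp) > target_kl


print(should_stop_early([-1.0, -2.0], [-1.0, -2.0]))  # unchanged policy
print(should_stop_early([-0.5, -1.5], [-1.0, -2.0]))  # large shift
```

In a training loop you would evaluate this after each mini-batch or epoch and break out of the update loop when it returns True.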
Recommended Reading and Resources for Number Trick Enthusiasts
- "Proximal Policy Optimization Algorithms" paper: the original spellbook.
- ICLR 2022 Blog, PPO Implementation Details: the 37-trick Rosetta Stone.
- Spinning Up in RL: gentle intro.
- Stable-Baselines3 docs: see which details they expose.
- CleanRL PPO: single-file clarity.
Frequently Asked Questions About the 37 Trick
Q1: Do I need all 37 or can I skip the entropy bonus?
A: You can skip entropy in dense-reward tasks, but in sparse mazes it's the difference between wander and win.
Q2: Why 37 and not 42?
A: 37 emerged empirically; 42 is the answer to everything except PPO reproduction.
Q3: Does this apply to TensorFlow?
A: Absolutely. The checklist is framework-agnostic; just translate the optimizer hparams.
Q4: Is asynchronous PPO better?
A: Not necessarily. The 37-details version is synchronous and hits SOTA scores; async adds complexity without guaranteed gains.
Q5: Can I run this on my laptop?
A: Classic control (CartPole): yes. Atari needs a GPU or EnvPool to stay sane.
Conclusion: Why the 37 Trick Remains a Timeless Classic
After diving deep into the labyrinth of the 37 trick, we've uncovered its true nature: not a single flashy move, but a masterful orchestration of 37 critical details that together create magic in reinforcement learning, and yes, even in RC wheel fitment! Whether you're training a neural network to master Atari or fitting 37″ Swamper tires on your RC rig, the secret lies in respecting every subtle nuance.
The positives of mastering the 37 trick are undeniable:
✅ Reliable, reproducible results in PPO training
✅ A robust framework that withstands noisy environments
✅ A blueprint that guides beginners and pros alike
✅ Surprising real-world applications beyond code (hello, RC enthusiasts!)
On the flip side:
❌ The checklist can feel overwhelming at first glance
❌ Skipping even one detail can cause mysterious failures
❌ Requires patience and careful debugging, so no instant gratification here!
Our confident recommendation? Embrace the 37 trick as your secret sauce. Print the checklist, study the environment wrappers, tune your optimizer parameters, and don't underestimate the power of tiny details. The magic is in the mastery of the minutiae.
Remember the question we teased earlier: Why does the 37 trick seem so surprising to most people? It's because the magic happens behind the scenes, invisible to the casual observer. Now you know the backstage secrets, so go impress your friends, your lab mates, or your RC club!
Recommended Links for Further Exploration
Books on Reinforcement Learning and Number Tricks:
- "Reinforcement Learning: An Introduction" by Sutton & Barto
- "Magical Mathematics" by Persi Diaconis and Ron Graham
- "The Art of Magic: The Gathering" (for number-based card tricks)
Key Online Resources:
- PPO Implementation Details Blog: https://iclr-blog-track.github.io/2022/03/25/ppo-implementation-details/
- OpenAI Baselines GitHub: https://github.com/openai/baselines
- Facebook Discussion on 37/13R16 Boggers and Sawblades: https://www.facebook.com/groups/1248693382307075/posts/1987911825051890/
Frequently Asked Questions About the 37 Trick
What is special about the number 37?
37 is a prime number that turns up in a variety of digit patterns and number tricks. In the context of PPO, it represents the 37 critical implementation details that ensure reliable and reproducible results. Its appeal lies in its prime nature and its frequent appearance in number tricks, making it a favorite among magicians and mathematicians alike.
Why does 37 show up everywhere?
37's frequent appearance is partly due to its mathematical properties: it's a prime number with neat divisibility traits (e.g., 3 × 37 = 111). In magic and mentalism, 37 is often chosen because it's unexpected yet memorable, creating a psychological anchor. In RL, the "37 trick" is a tongue-in-cheek reference to the 37 essential details that practitioners must follow.
What is the multiplication trick for 37?
A classic number trick involves multiplying 37 by multiples of 3, which produces three-digit numbers made of a single repeated digit. For example, 3 × 37 = 111, 6 × 37 = 222, and so on up to 27 × 37 = 999. These patterns arise from 37's relationship with 111 (since 37 × 3 = 111), making it a favorite in mental math demonstrations.
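Because 37 × 3 = 111, multiplying 37 by 3k for k = 1 to 9 gives 111 × k, a three-digit number with one repeated digit. A short script (the helper name is just for illustration) lists the whole family:

```python
def repdigit_products():
    """Return (multiplier, product) pairs for 37 times the multiples of 3 up to 27."""
    rows = []
    for k in range(1, 10):
        multiplier = 3 * k
        rows.append((multiplier, multiplier * 37))
    return rows


for multiplier, product in repdigit_products():
    print(f"{multiplier:>2} x 37 = {product}")
```

The first row is 3 x 37 = 111 and the last is 27 x 37 = 999; past 27 the products have four digits and the repeated-digit pattern no longer holds in this simple form.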
What is the 37 trick?
The "37 trick" can refer to different things depending on context:
- In reinforcement learning, it's the 37 detailed implementation steps that make PPO work reliably.
- In RC wheel fitment, it's the combination of 37″ tires, 6″ wheels, and 2″ spacers that fit perfectly with minor trimming.
- In magic and mentalism, it's a number-based trick exploiting 37's mathematical quirks to surprise audiences.
How to do the 37 trick?
For RL practitioners, doing the 37 trick means meticulously following the 37-item checklist laid out in the PPO implementation details. For magicians, it involves using 37 in number puzzles or card tricks to create surprising outcomes. For RC enthusiasts, it's about combining the right tire and wheel specs with spacers and trimming.
What is the mathematical basis behind the 37 trick?
Mathematically, the 37 trick in RL is about careful algorithmic tuning: clipping ratios, advantage normalization, optimizer parameters, and environment wrappers. In number theory, 37's properties as a prime and its relation to 111 create patterns exploited in multiplication and divisibility tricks.
Can the 37 trick be used to predict other numbers?
While the 37 trick itself is specific, the principle of hidden details and modular arithmetic behind it can be generalized to other numbers and tricks. Magicians often use similar logic with other primes or special numbers to create illusions.
Why does the 37 trick seem so surprising to most people?
Because the trick's complexity is hidden: the audience sees a simple outcome but not the 37 underlying steps or mathematical properties. This mismatch triggers surprise and wonder, a classic hallmark of magic and effective algorithms alike.
Are there variations of the 37 trick in other number games?
Yes! Variations exist in card tricks, mental math, and other number puzzles. For example, the 1089 trick or the 6174 Kaprekar constant share the theme of iterative convergence and hidden patterns, much like the 37 trick's layered details.
How can understanding the 37 trick improve mental math skills?
By studying the 37 trick, you sharpen your ability to recognize patterns, modular relationships, and algorithmic thinking, all valuable in mental math. It trains your brain to spot hidden structures and anticipate outcomes.
What are some famous illusions related to the 37 trick?
Illusions involving 37 often include number guessing games, multiplication patterns, and modular arithmetic puzzles. These illusions rely on the audience's unfamiliarity with 37's unique properties to create "impossible" predictions.
How does the 37 trick demonstrate patterns in number theory?
The 37 trick highlights how prime numbers interact with digit patterns, divisibility, and modular arithmetic. It shows that seemingly random numbers can have deep, predictable structures, a core insight in number theory.
Reference Links and Credible Sources
- PPO Implementation Details Blog (ICLR): https://iclr-blog-track.github.io/2022/03/25/ppo-implementation-details/
- OpenAI Baselines GitHub: https://github.com/openai/baselines
- Stable-Baselines3 Documentation: https://stable-baselines3.readthedocs.io/en/master/modules/ppo.html
- EnvPool GitHub: https://github.com/sail-sg/envpool
- Weights & Biases Experiment Tracking: https://wandb.ai/site
- RC4WD Official Website: https://rc4wd.com
- Facebook Group Discussion on 37/13R16 Boggers and Sawblades: https://www.facebook.com/groups/1248693382307075/posts/1987911825051890/
- Magical Mathematics by Persi Diaconis and Ron Graham (Amazon): https://www.amazon.com/Mathematics-Magic-Persi-Diaconis/dp/069111969X
Thanks for exploring the magic of the 37 trick with us at Mind Trick™!
Ready for more mind-bending illusions and expert insights? Dive into our Card Tricks and Magic Psychology sections next!




