hop hop hop

Maybe you've seen my previous post about solving CartPole-v1 with just bitwise ops. I tried to scale that approach to harder environments, but it didn't get me very far. However, I was inspired by a completely unrelated article, "Eigenvalues as models". While its author talks about matrices of size 3×3 and larger, I went the other way: I restricted the weight matrix to be diagonal. This means the eigenvalues are simply the vector elements themselves. To get the maximum or minimum eigenvalue we literally just take the […]

Now we can define a function […] where […]

If you read the "Eigenvalues as models" article you know that we can take […] Since a concave function is just a convex one with a flipped sign, we can define a DC function, which is a difference of two convex functions, and it turns out it can approximate a lot of functions. So in our case it's actually a sum: […]

This gives us a scalar back, and as long as the number of eigenvalues is more than 2 (3, 4, …) this function is non-linear; given enough eigenvalues we have a pretty powerful approximator! (When there are only 2 eigenvalues, the function collapses to just the sum of those 2 eigenvalues, which is linear.)

We can easily extend it to high-dimensional inputs: […]

However, if […]

[Figure: The Double Well Potential with sharp decision boundary]

The only problem is that the […]

Now my loose interpretation of the […]

So for the BipedalWalker-v3 problem I wanted to do the simplest thing possible. Since we now have a "pretty powerful" neuron, I just assigned 4 separate neurons, each controlling one joint independently. I trained them directly with PPO and somehow they learnt to synchronize without any physical link between them.
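The max-plus-min construction described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the author's code: the name `eigen_neuron` is hypothetical, and the post's own formula did not survive, so the sketch assumes each eigenvalue of the diagonal matrix is an affine function of a scalar input, `weights[i] * x + biases[i]`.

```python
def eigen_neuron(x, weights, biases):
    """Return the largest plus the smallest 'eigenvalue' for input x.

    Assumption (not from the post): eigenvalue i of the diagonal matrix
    is the affine function weights[i] * x + biases[i].
    """
    eigenvalues = [w * x + b for w, b in zip(weights, biases)]
    # For a diagonal matrix the eigenvalues are the diagonal entries,
    # so the extreme eigenvalues are just max() and min() of the list.
    return max(eigenvalues) + min(eigenvalues)


def two(x):
    # Two eigenvalues: max + min is the sum of both, so this is linear in x.
    return eigen_neuron(x, [1.0, -3.0], [0.5, 2.0])


def three(x):
    # Three eigenvalues: piecewise linear with kinks, so no longer linear.
    return eigen_neuron(x, [1.0, -1.0, 0.0], [0.0, 0.0, 1.0])


if __name__ == "__main__":
    for x in (-2.0, 0.0, 0.5, 2.0):
        print(x, two(x), three(x))
```

With two eigenvalues the output stays on a straight line, while the three-eigenvalue version peaks at `x = 0` and falls off on both sides, which matches the collapse-to-linear remark above.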
I used 6 eigenvalues for each neuron and distilled the policy down to 69 lines of Python code, which you can just copy-paste and run if you have gymnasium and numpy installed. The entire logic for "hopping"/"walking" is literally here: […]

This should get you an average score of about 310, which is considered "solved" for this environment. While it's no longer just "bitwise ops" like in the CartPole-v1 case, I think it shares the same spirit.

=== EDIT ===

I just realized you can set all the […]

=== EDIT 2 ===

However, on second thought, whether you can just drop the […]

=== EDIT 3 ===

[…]

submitted by /u/kiockete
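The overall shape of such a policy (4 independent neurons, 6 eigenvalues each, one output per joint) might look roughly like the sketch below. It is not the 69-line program from the post, which did not survive here. The parameters are random placeholders rather than trained values, the affine form of each eigenvalue and the clipping to [-1, 1] are assumptions, and the names `neuron`, `policy`, `W` and `B` are hypothetical. Plain Python is used so the sketch runs without any dependencies.

```python
import random

OBS_DIM = 24     # BipedalWalker-v3 observation size
NUM_JOINTS = 4   # one neuron per joint, matching the 4-dimensional action
NUM_EIGS = 6     # eigenvalues per neuron, as stated in the post

rng = random.Random(0)
# Placeholder parameters: random values, NOT the trained weights from the post.
W = [[[rng.uniform(-1, 1) for _ in range(OBS_DIM)]
      for _ in range(NUM_EIGS)]
     for _ in range(NUM_JOINTS)]
B = [[rng.uniform(-1, 1) for _ in range(NUM_EIGS)]
     for _ in range(NUM_JOINTS)]


def neuron(obs, weights, biases):
    """Max plus min over eigenvalues, each assumed affine in the observation."""
    eigs = [sum(w * o for w, o in zip(row, obs)) + b
            for row, b in zip(weights, biases)]
    return max(eigs) + min(eigs)


def policy(obs):
    """Map one observation to 4 joint actions, clipped to [-1, 1] (assumption)."""
    return [max(-1.0, min(1.0, neuron(obs, W[j], B[j])))
            for j in range(NUM_JOINTS)]


action = policy([0.0] * OBS_DIM)
print(action)
```

To try it in the real environment, one would create `BipedalWalker-v3` with gymnasium, convert `policy(obs)` to a NumPy array and pass it to `env.step`. With these random parameters the walker should not be expected to walk; the sketch only shows how the four neurons fit together.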



![[P] I solved BipedalWalker-v3 (~310 score) with eigenvalues. The entire policy fits in this post.](https://technologiesdigest.com/wp-content/uploads/2026/01/P-I-solved-BipedalWalker-v3-310-score-with-eigenvalues-The-entire.jpg)