Cartpole Deep Q Learning
Important details found
- Modified reward function: 1.0 * (-abs(Pole Angle) + 0.21)
- Batch size: 16
- Structure shape: [128, 128, 128] (this is overkill)
- A pole is attached by an un-actuated joint to a cart, which moves along a frictionless track.
- This tutorial contains a step-by-step explanation, code walkthrough, and demo of how
- Balancing a typical inverted pendulum with Temporal Difference methods.
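The settings listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the original tutorial's code. It assumes the standard Gym-style CartPole observation layout (cart position, cart velocity, pole angle in radians, pole angular velocity). The names `modified_reward`, `td_target`, and `sample_batch` are hypothetical, and the discount factor `gamma` is not given in the source. In the standard environment an episode ends once the pole angle passes roughly 0.2095 rad (12 degrees), so the 0.21 offset would keep this reward non-negative while the pole is in bounds. That reading is an inference, not something the source states.

```python
import random

# Assumption: Gym-style CartPole observation layout is
# [cart position, cart velocity, pole angle (rad), pole angular velocity].
POLE_ANGLE_INDEX = 2


def modified_reward(observation, scale=1.0, offset=0.21):
    """Reward from the list above: scale * (-abs(pole angle) + offset)."""
    pole_angle = observation[POLE_ANGLE_INDEX]
    return scale * (-abs(pole_angle) + offset)


def td_target(reward, next_q_values, done, gamma=0.99):
    """One-step Q-learning (temporal-difference) target.

    gamma is a hypothetical discount factor; the source does not give one.
    """
    if done:
        # Terminal transition: there is no future value to bootstrap from.
        return reward
    return reward + gamma * max(next_q_values)


def sample_batch(replay_buffer, batch_size=16):
    """Uniformly sample a minibatch; 16 is the batch size listed above."""
    return random.sample(replay_buffer, min(batch_size, len(replay_buffer)))


# Usage: an upright pole earns the full offset, a tilted one earns less.
upright_reward = modified_reward([0.0, 0.0, 0.0, 0.0])
tilted_reward = modified_reward([0.0, 0.0, 0.15, 0.0])
target = td_target(tilted_reward, next_q_values=[0.4, 0.7], done=False)
```

Because the reward shrinks linearly with the absolute pole angle, the agent gets a graded signal for keeping the pole near vertical rather than a constant reward per surviving step.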