Evaluation and Metrics
Evaluating an agent's performance is crucial to understanding how well it is learning.
Main Metrics
Total Reward
The sum of all rewards received in an episode.
Average Reward
The total reward of an episode divided by the number of steps in that episode.
Success Rate
The percentage of episodes that end successfully.
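The three metrics above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample reward list are hypothetical, and what counts as a "successful" episode is assumed to be decided elsewhere and passed in as a boolean.

```python
from typing import List, Tuple


def summarize_episode(rewards: List[float]) -> Tuple[float, float]:
    """Return (total reward, average reward per step) for one episode."""
    total = sum(rewards)
    # Guard against an empty episode to avoid dividing by zero.
    average = total / len(rewards) if rewards else 0.0
    return total, average


def success_rate(outcomes: List[bool]) -> float:
    """Percentage of episodes flagged as successful."""
    if not outcomes:
        return 0.0
    return 100.0 * sum(outcomes) / len(outcomes)


# Hypothetical per-step rewards for a single episode.
episode_rewards = [-1.0, -1.0, 0.0, 10.0]
total, average = summarize_episode(episode_rewards)

# Hypothetical success flags, one per episode.
rate = success_rate([True, False, True, True])
print(total, average, rate)
```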
Learning Curves
Reward vs Episodes
- X-axis - Number of episodes
- Y-axis - Average reward
- Trend - Should increase over time
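Raw per-episode rewards are usually noisy, so the curve described above is often smoothed before plotting. One possible approach, shown here as a sketch, is a trailing moving average; the function name and the window size are assumptions, not something this guide prescribes.

```python
from typing import List


def moving_average(values: List[float], window: int) -> List[float]:
    """Trailing mean over the most recent `window` values.

    Early entries average over however many values exist so far,
    so the output has the same length as the input.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            # Drop the value that just left the window.
            running -= values[i - window]
        smoothed.append(running / min(i + 1, window))
    return smoothed


# Hypothetical per-episode rewards; plot `curve` against the episode index.
curve = moving_average([-5.0, -4.0, 0.0, 2.0, 10.0], window=2)
print(curve)
```

The smoothed list can be passed to any plotting library, with the episode index on the X-axis and the smoothed reward on the Y-axis.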
Example in Ants Saga
- Episodes 1-100 - Average reward: -5
- Episodes 100-500 - Average reward: 0
- Episodes 500+ - Average reward: 10+
Key Performance Indicators
Convergence
When the agent's performance stabilizes and stops improving.
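One simple way to check for this, offered as a heuristic rather than a standard definition, is to compare the mean reward of two consecutive windows of episodes and report the first point where they differ by no more than a tolerance. The function name, window size, and tolerance below are all assumptions.

```python
from statistics import mean
from typing import List, Optional


def convergence_episode(
    rewards: List[float], window: int = 50, tolerance: float = 0.5
) -> Optional[int]:
    """Return the number of episodes seen when two consecutive windows
    first have mean rewards within `tolerance`, or None if that never happens.
    """
    for end in range(2 * window, len(rewards) + 1):
        previous = mean(rewards[end - 2 * window : end - window])
        recent = mean(rewards[end - window : end])
        if abs(recent - previous) <= tolerance:
            return end
    return None


# Hypothetical curve: rewards rise for 10 episodes, then stay flat at 10.
history = list(range(10)) + [10] * 10
print(convergence_episode(history, window=3, tolerance=0.1))
```

Note that this check only detects a plateau. An agent that stabilizes at a poor reward level also counts as converged, so the result should be read together with the reward value itself.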
Sample Efficiency
How many samples (interactions) the agent needs to learn.
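Sample efficiency can be made concrete by counting the environment steps consumed before the agent first reaches a target reward. The sketch below assumes per-episode rewards and episode lengths are logged; the function name, the target, and the averaging window are hypothetical choices.

```python
from typing import List, Optional


def steps_to_threshold(
    episode_rewards: List[float],
    episode_lengths: List[int],
    target: float,
    window: int = 10,
) -> Optional[int]:
    """Total steps used by the time the mean reward over the last
    `window` episodes first reaches `target`; None if it never does.
    """
    steps = 0
    for i, (_, length) in enumerate(zip(episode_rewards, episode_lengths)):
        steps += length
        if i + 1 >= window:
            recent = episode_rewards[i + 1 - window : i + 1]
            if sum(recent) / window >= target:
                return steps
    return None


# Hypothetical log of five episodes.
rewards = [0.0, 0.0, 5.0, 10.0, 10.0]
lengths = [100, 100, 80, 60, 60]
print(steps_to_threshold(rewards, lengths, target=8.0, window=2))
```

A lower step count for the same target means the agent needed fewer interactions to learn, which makes the number convenient for comparing algorithms or hyperparameter settings.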
Stability
How consistent the agent's performance is across different runs.
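Consistency across runs can be summarized by training with several random seeds and reporting the spread of the final results. The following is a minimal sketch under that assumption; the function name and the seed-to-reward mapping are hypothetical.

```python
from statistics import mean, stdev
from typing import Dict


def stability_report(final_reward_by_seed: Dict[int, float]) -> Dict[str, float]:
    """Summarize final average reward across runs with different seeds."""
    values = list(final_reward_by_seed.values())
    return {
        "mean": mean(values),
        # Sample standard deviation needs at least two runs.
        "std": stdev(values) if len(values) > 1 else 0.0,
        "min": min(values),
        "max": max(values),
    }


# Hypothetical final average rewards from three seeds.
report = stability_report({0: 8.0, 1: 10.0, 2: 12.0})
print(report)
```

A small standard deviation relative to the mean suggests the training setup is stable; a large one suggests the results depend heavily on the seed.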
Visualization Tools
TensorBoard
- Real-time monitoring
- Multiple metrics
- Comparison between runs
Custom Plots
- Learning curves
- Reward distribution
- Action frequency
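As an example of preparing data for one of these plots, the sketch below computes how often each action was chosen. The action names are placeholders and are not taken from Ants Saga; the resulting dictionary can be fed to a bar chart in any plotting library.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable


def action_frequencies(actions: Iterable[Hashable]) -> Dict[Hashable, float]:
    """Fraction of the time each action was selected."""
    counts = Counter(actions)
    total = sum(counts.values())
    # An empty log yields an empty dict, so no division by zero occurs.
    return {action: count / total for action, count in counts.items()}


# Hypothetical action log from one episode.
freqs = action_frequencies(["left", "left", "right", "stay"])
print(freqs)
```

A distribution dominated by a single action can be a sign that the agent has stopped exploring.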
Debugging Performance
Common Issues
- No learning - Check reward function
- Unstable - Reduce learning rate
- Slow convergence - Increase exploration
Best Practices
- Monitor multiple metrics
- Use multiple random seeds
- Compare with baselines