Leaderboard

Detailed Leaderboard

Mean scores across all 19 tasks, reported for two scenarios: 1000 training examples per task, and the full training sets. All cells report top-1 accuracy.

Breakdown Across Tasks

1000 Training Examples per task
Full Training Sets
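
As a rough illustration of how a mean score of this kind can be computed, the sketch below derives top-1 accuracy for each task and then averages across tasks. It assumes an unweighted mean over tasks, which the page does not state explicitly. The function names (`top1_accuracy`, `leaderboard_mean`) and the task names in the usage note are hypothetical and not taken from the leaderboard.

```python
from statistics import mean


def top1_accuracy(predictions, labels):
    """Fraction of examples whose top prediction equals the true label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty task")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def leaderboard_mean(per_task_accuracy):
    """Unweighted mean of per-task accuracies (an assumption, not confirmed by the source)."""
    return mean(per_task_accuracy.values())
```

For example, with two hypothetical tasks, `leaderboard_mean({"task_a": 0.5, "task_b": 1.0})` averages the two accuracies so that each task counts equally regardless of its test-set size.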