The goal of the Kinetics dataset is to help the computer vision and machine learning communities advance models for video understanding. Given this large human action classification dataset, it may be possible to learn powerful video representations that transfer to different video tasks.
The Kinetics-700-2020 dataset will be used for this challenge. Kinetics-700-2020 is a large-scale, high-quality dataset of YouTube video URLs covering a diverse range of human-focused actions. It is an approximate superset of Kinetics-400 (released in 2017), Kinetics-600 (released in 2018), and Kinetics-700 (released in 2019).
The dataset consists of approximately 650,000 video clips, and covers 700 human action classes with at least 700 video clips for each action class. Each clip lasts around 10 seconds and is labeled with a single class. All of the clips have been through multiple rounds of human annotation, and each is taken from a unique YouTube video. The actions cover a broad range of classes including human-object interactions such as playing instruments, as well as human-human interactions such as shaking hands and hugging.
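As a rough illustration of the "at least 700 video clips for each action class" property, the sketch below counts clips per label in a list of annotation rows and reports any class that falls short. The row layout (a `label` key plus a video id and time range) and the helper names are assumptions made for this example, not a description of the official annotation files.

```python
from collections import Counter

def clips_per_class(rows):
    """Count how many clips carry each action label.

    `rows` is assumed to be an iterable of dicts with a 'label' key.
    """
    return Counter(row["label"] for row in rows)

def classes_below_minimum(counts, minimum=700):
    """Return the labels with fewer than `minimum` clips, sorted by name."""
    return sorted(label for label, n in counts.items() if n < minimum)

# Tiny hypothetical sample; real annotation files are far larger.
rows = [
    {"label": "shaking hands", "youtube_id": "id_a", "time_start": 0, "time_end": 10},
    {"label": "shaking hands", "youtube_id": "id_b", "time_start": 5, "time_end": 15},
    {"label": "hugging", "youtube_id": "id_c", "time_start": 2, "time_end": 12},
]

counts = clips_per_class(rows)
short = classes_below_minimum(counts, minimum=2)  # lowered threshold for the toy data

# Lower bound implied by the stated figures: 700 classes x 700 clips each.
min_total_clips = 700 * 700
```

The implied lower bound of 490,000 clips is consistent with the stated total of approximately 650,000.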
More information about how to download the Kinetics dataset is available here.
1. Possible to use ImageNet checkpoints?
We allow fine-tuning from public ImageNet checkpoints for the supervised track, but a link to the specific checkpoint should be provided with each submission.
2. Possible to use optical flow?
Optical flow can be used as long as the flow model is not trained on external datasets, unless those datasets are synthetic.
3. Can we train on test data without labels (e.g. transductive)?
No.
4. Can we use semantic class label information?
Yes, for the supervised track.
5. Will there be special tracks for methods using fewer FLOPs / small models or just RGB vs RGB+Audio in the self-supervised track?
We will ask participants to provide the total number of model parameters and the modalities used and plan to create special mentions for those doing well in each setting, but not specific tracks.
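One way to compute the total parameter count is to sum the number of elements in every weight tensor. The sketch below does this from a mapping of tensor names to shapes; the layer names, shapes, and report fields are hypothetical and only illustrate the arithmetic, not a required submission format.

```python
from math import prod

def count_parameters(shapes):
    """Sum the element counts of all tensors, given a name -> shape mapping."""
    return sum(prod(shape) for shape in shapes.values())

# Hypothetical shapes for a toy video model with a 700-way classifier head.
toy_shapes = {
    "conv1.weight": (64, 3, 3, 7, 7),  # out channels, in channels, time, height, width
    "conv1.bias": (64,),
    "fc.weight": (700, 512),
    "fc.bias": (700,),
}

total = count_parameters(toy_shapes)

# Illustrative summary of the two items the organizers ask for.
report = {"total_parameters": total, "modalities": ["RGB"]}
```

With a deep learning framework, the same mapping can usually be built from the shapes of the model's parameter tensors.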