112,183 computer vision and machine learning papers, classified like a biological phylogenetic tree
(CVPR / NeurIPS / ICML / ICCV / ICLR / ECCV / 3DV, 1987~2025)
EDA results from classifying the papers into a 4-depth semantic taxonomy: 16 Phylum × ~120 Class × ~400 Order × Genus.
Object Detection, Generative Models, 3D Vision
Object Detection ▸ Anchor-free Detection
Anchor-free Detection ▸ Query-based Detection
Query-based Detection ▸ DETR variants
X = 1987~2025, Y = papers per year (stacked). Color = 16 Phyla + Editorial + Unclassified. Hover shows the exact count.
One panel per Phylum (16 in total). Within each panel, the top 8 Classes are drawn as a stacked area. Each panel title is shown in its Phylum color.
Rows = all Classes (in Phylum group order), columns = 3-year buckets. Color = log10(1 + paper count). Hover shows the Class name + exact count.
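The heatmap cell value can be sketched as follows. This is a minimal illustration, not the page's actual code: the function names and the toy records are hypothetical, and aligning the 3-year buckets to 1987 is an assumption.

```python
import math
from collections import Counter

def bucket_start(year, first_year=1987, width=3):
    """Return the first year of the 3-year bucket containing `year`.

    Assumption: buckets are aligned to `first_year` (1987-1989, 1990-1992, ...).
    """
    return first_year + ((year - first_year) // width) * width

def heatmap_cells(records):
    """Map (class_name, bucket_start) -> (raw_count, log10(1 + raw_count))."""
    counts = Counter((cls, bucket_start(year)) for cls, year in records)
    return {key: (n, math.log10(1 + n)) for key, n in counts.items()}

# Hypothetical toy records: (Class name, publication year)
records = [("Diffusion Models", 2023)] * 9 + [("Diffusion Models", 2020)]
cells = heatmap_cells(records)
for (cls, start), (n, shade) in sorted(cells.items()):
    print(f"{cls} {start}-{start + 2}: count={n}, color value={shade:.3f}")
```

The `1 +` inside the logarithm keeps empty cells at a color value of 0 instead of negative infinity, and the log scale keeps a few very large Classes from washing out the rest of the grid.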
A mini panel for each of the 12 largest Classes. Inside each panel, the top 6 Orders are drawn as a stacked area. Hover to see the Order name + count.
Phylogenetic pie chart. A sidebar search highlights every wedge containing the search term at once. Hover to emphasize, click to select a lineage, then navigate with the arrow keys.
A classic phylogenetic tree branching from the root on the left toward the right. Click a node or label to expand/collapse. By default the tree is expanded only down to the Phylum level; to go deeper, click a node or use the sidebar buttons.
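The expand/collapse behaviour can be pictured as a depth-limited walk over a nested tree. The sketch below is only an illustration under assumptions: `build_tree`, `visible`, and every "Toy" name are hypothetical, and the page's real data structure is not shown in the source.

```python
def build_tree(paths):
    """Nest (phylum, class, order, genus) paths into dicts with rolled-up counts."""
    root = {"name": "root", "count": 0, "children": {}}
    for path in paths:
        node = root
        node["count"] += 1
        for name in path:
            node = node["children"].setdefault(
                name, {"name": name, "count": 0, "children": {}}
            )
            node["count"] += 1  # every ancestor counts the paper once
    return root

def visible(node, max_depth, depth=0):
    """Yield (depth, name, count) for nodes expanded down to `max_depth`."""
    yield depth, node["name"], node["count"]
    if depth < max_depth:
        for child in node["children"].values():
            yield from visible(child, max_depth, depth + 1)

# One path per paper; the "Toy" entries are hypothetical placeholders.
paths = [
    ("Object Detection", "Anchor-free Detection", "Query-based Detection", "DETR variants"),
    ("Object Detection", "Anchor-free Detection", "Query-based Detection", "Toy Genus"),
    ("3D Vision", "Toy Class", "Toy Order", "Toy Genus"),
]
tree = build_tree(paths)
rows = list(visible(tree, max_depth=1))  # default view: root + Phylum level
for depth, name, count in rows:
    print("  " * depth + f"{name} ({count})")
```

Raising `max_depth` to 2, 3, or 4 corresponds to expanding down to Class, Order, or Genus.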
For each of the 16 Phyla across all 112,183 papers: year of first appearance, peak year (paper count), and the 5-year growth rate from 2016-20 to 2021-25.
| Phylum | Total | First | Peak | 2016-20 | 2021-25 | Growth |
|---|---|---|---|---|---|---|
| 1. Object Detection & Localization | 3,859 | 1987 | 2024 (416) | 936 | 1,695 | +81.1% |
| 2. Segmentation | 2,967 | 1988 | 2024 (383) | 675 | 1,414 | +109.5% |
| 3. 3D Vision & Reconstruction | 10,970 | 1987 | 2024 (1,567) | 2,031 | 5,405 | +166.1% |
| 4. Image Recognition & Retrieval | 8,090 | 1987 | 2024 (636) | 1,631 | 2,569 | +57.5% |
| 5. Video & Motion Understanding | 6,215 | 1987 | 2024 (603) | 1,440 | 2,557 | +77.6% |
| 6. Generative Models & Synthesis | 6,488 | 1988 | 2024 (1,683) | 1,275 | 4,928 | +286.5% |
| 7. Representation Learning | 6,619 | 1987 | 2023 (1,002) | 1,469 | 4,097 | +178.9% |
| 8. Vision-Language & Multimodal | 5,691 | 1988 | 2024 (1,901) | 529 | 5,014 | +847.8% |
| 9. Low-level Vision | 2,900 | 1988 | 2024 (358) | 784 | 1,515 | +93.2% |
| 10. Human-centric Vision | 4,506 | 1987 | 2024 (394) | 1,356 | 1,708 | +26.0% |
| 11. Deep Learning Architecture | 7,751 | 1987 | 2024 (1,135) | 2,281 | 4,588 | +101.1% |
| 12. Training Strategies | 11,404 | 1987 | 2024 (1,472) | 2,543 | 5,678 | +123.3% |
| 13. Optimization & Learning Theory | 6,819 | 1987 | 2024 (810) | 1,746 | 3,257 | +86.5% |
| 14. Reinforcement Learning & Decision Making | 5,634 | 1987 | 2024 (873) | 1,271 | 3,543 | +178.8% |
| 15. Efficient & Robust ML | 11,993 | 1987 | 2024 (1,666) | 2,596 | 6,553 | +152.4% |
| 16. Application Domains | 3,655 | 1987 | 2024 (631) | 693 | 2,186 | +215.4% |
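The Growth column is consistent with a plain percent change between the two 5-year windows, rounded to one decimal. A minimal sketch, with a hypothetical helper name and three rows copied from the table above:

```python
def growth_pct(prev, curr):
    """Percent change from `prev` to `curr`, rounded to one decimal place."""
    if prev == 0:
        return None  # growth is undefined when the earlier window is empty
    return round((curr - prev) / prev * 100, 1)

# (2016-20 count, 2021-25 count) taken from the table above
rows = {
    "1. Object Detection & Localization": (936, 1695),
    "8. Vision-Language & Multimodal": (529, 5014),
    "10. Human-centric Vision": (1356, 1708),
}
for phylum, (prev, curr) in rows.items():
    print(f"{phylum}: {growth_pct(prev, curr):+.1f}%")
```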
Top 10 Classes by paper count over the last 5 years.
| # | Phylum | Class | Papers |
|---|---|---|---|
| 1 | 12. Training Strategies | Training Techniques | 4,233 |
| 2 | 14. Reinforcement Learning & Decision Making | Reinforcement Learning | 3,621 |
| 3 | 8. Vision-Language & Multimodal | Language Model Applications | 3,016 |
| 4 | 13. Optimization & Learning Theory | Optimization Theory | 2,850 |
| 5 | 3. 3D Vision & Reconstruction | 3D Scene Understanding | 2,518 |
| 6 | 11. Deep Learning Architecture | General Deep Learning | 2,125 |
| 7 | 15. Efficient & Robust ML | Bayesian & Probabilistic Methods | 1,989 |
| 8 | 6. Generative Models & Synthesis | Diffusion Models | 1,893 |
| 9 | 3. 3D Vision & Reconstruction | Neural Implicit Representations | 1,459 |
| 10 | 2. Segmentation | Image Segmentation | 1,424 |
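A ranking like this can be produced by counting papers per (Phylum, Class) pair inside a year window. The sketch below is an illustration only: the function name and the toy records are hypothetical, and treating "last 5 years" as 2021 to 2025 inclusive is an assumption.

```python
from collections import Counter

def top_classes(records, start=2021, end=2025, n=10):
    """Return the n most frequent (phylum, class) pairs within [start, end]."""
    counts = Counter(
        (phylum, cls) for phylum, cls, year in records if start <= year <= end
    )
    return counts.most_common(n)

# Hypothetical toy records: (Phylum, Class, publication year)
records = (
    [("Toy Phylum A", "Toy Class 1", 2022)] * 3
    + [("Toy Phylum A", "Toy Class 2", 2019)] * 5  # outside the window, ignored
    + [("Toy Phylum B", "Toy Class 3", 2024)] * 2
)
ranking = top_classes(records, n=2)
for rank, ((phylum, cls), papers) in enumerate(ranking, start=1):
    print(rank, phylum, cls, papers)
```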
Classes that were once active but have nearly disappeared in recent years.
| Phylum > Class | Pre-2015 | Post-2020 | Retain |
|---|---|---|---|
| No Class meets the criteria (Pre-2015 ≥ 20 papers & Post-2020 ≤ 10%). At the Class level, every CV/ML area that was active before 2015 still maintains a meaningful publication volume. | | | |
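The filter behind this table can be sketched as below. Several details are assumptions, not confirmed by the source: "Pre-2015" is read as years before 2015, "Post-2020" as years after 2020, and "Retain" as the Post-2020 count divided by the Pre-2015 count. The function name and toy records are hypothetical.

```python
from collections import defaultdict

def declining_classes(records, min_pre=20, max_retain=0.10):
    """Return (class, pre_count, post_count, retain) for Classes that faded out."""
    pre, post = defaultdict(int), defaultdict(int)
    for cls, year in records:
        if year < 2015:       # assumed meaning of "Pre-2015"
            pre[cls] += 1
        elif year > 2020:     # assumed meaning of "Post-2020"
            post[cls] += 1
    result = []
    for cls, n_pre in pre.items():
        if n_pre < min_pre:
            continue          # too small to count as "once active"
        retain = post[cls] / n_pre
        if retain <= max_retain:
            result.append((cls, n_pre, post[cls], retain))
    return result

# Hypothetical toy records: (Class name, publication year)
records = (
    [("Toy Class A", 2005)] * 20 + [("Toy Class A", 2022)] * 1
    + [("Toy Class B", 2010)] * 30 + [("Toy Class B", 2023)] * 15
)
faded = declining_classes(records)
print(faded)
```

In the toy data only Toy Class A passes (20 papers before 2015, 1 after 2020, retain 5%); on the real dataset the table above reports that no Class does.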
First paper year + cumulative count for each hot category, found by keyword matching.
| Category | First year | First paper | Total | 2021-25 |
|---|---|---|---|---|
| Diffusion Models | 2020 | Improved Techniques for Training Score-Based Generative Models (NeurIPS) | 256 | 253 |
| Vision Transformer (ViT) | 2021 | Manipulation Detection in Satellite Images Using Vision Transformer (CVPR) | 497 | 497 |
| DETR / transformer detection | 2021 | UP-DETR: Unsupervised Pre-Training for Object Detection With Transformers (CVPR) | 48 | 48 |
| NeRF / Neural Radiance Field | 2020 | NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis (ECCV) | 358 | 357 |
| 3D Gaussian Splatting | 2024 | Gaussian Splatting Decoder for 3D-aware Generative Adversarial Networks (CVPR) | 290 | 290 |
| Vision-Language Models (CLIP) | 2021 | VinVL: Revisiting Visual Representations in Vision-Language Models (CVPR) | 599 | 599 |
| Segment Anything (SAM) | 2023 | Segment Anything (ICCV) | 70 | 70 |
| Foundation Models | 2022 | FETA: Towards Specializing Foundational Models for Expert Task Applications (NeurIPS) | 381 | 381 |
| Self-Supervised (modern) | 2021 | Masked autoencoder / SimCLR / MoCo / DINOv2 family | 135 | 135 |
| Mamba / State Space | 2024 | Vision Mamba / VMamba family | 63 | 63 |
| LoRA / PEFT | 2022 | Low-Rank Adaptation / Parameter-Efficient Fine-tuning family | 102 | 102 |
| ControlNet / Diffusion Edit | 2023 | InstructPix2Pix: Learning to Follow Image Editing Instructions (CVPR) | 23 | 23 |
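The keyword matching can be sketched as a case-insensitive substring search. This is a minimal illustration under assumptions: matching on titles only, the keyword list, and the function name are all hypothetical, and the "Toy" papers are placeholders. Only the NeRF and Segment Anything titles come from the table above.

```python
def keyword_stats(papers, keywords, recent=(2021, 2025)):
    """Return first year, first paper, total, and recent count for matching titles."""
    lowered = [k.lower() for k in keywords]
    hits = [
        (year, title)
        for title, year in papers
        if any(k in title.lower() for k in lowered)
    ]
    if not hits:
        return None
    first_year, first_title = min(hits, key=lambda hit: hit[0])
    in_recent = sum(1 for year, _ in hits if recent[0] <= year <= recent[1])
    return {
        "first_year": first_year,
        "first_paper": first_title,
        "total": len(hits),
        "recent": in_recent,
    }

papers = [
    ("NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis", 2020),
    ("Toy paper on neural radiance field editing", 2022),  # hypothetical
    ("Segment Anything", 2023),
    ("Toy paper on NeRF compression", 2024),               # hypothetical
]
stats = keyword_stats(papers, ["nerf", "neural radiance field"])
print(stats)
```

Substring matching only finds papers that name the technique explicitly, which is one plausible reason the first paper listed for a category is not always the work that introduced the idea.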