Foundation AI for Spatial and Generative Intelligence
Building product-facing foundation AI systems for maps, mobility, generative assets, and interactive worlds.
AMAP-ML is an AI team at Alibaba AMAP. We connect research, engineering, and real-world deployment across spatial intelligence, generative intelligence, reasoning agents, reinforcement learning, multimodal models, world models, and product-scale AIGC systems.
Our work spans production systems, open-source projects, benchmarks, and publications at ICLR, CVPR, ACL, AAAI, SIGGRAPH, ICCV, ICML, EMNLP, ACM MM, and WWW. We release code and evaluation assets to help the community reproduce, compare, and extend our work.
Join us | Research interns, full-time researchers, and AI engineers in LLM agents, reinforcement learning, world models, multimodal learning, spatial intelligence, and generative AI are welcome to get in touch.
AMAP-ML builds foundation AI capabilities around two AMAP product anchors:
| Product anchor | What it means |
| --- | --- |
| Spatial intelligence | AI agents and models that understand, reason, plan, and act in real-world map, mobility, urban, autonomous-driving, and local-service scenarios. |
| Generative intelligence | AIGC systems that create, edit, evaluate, and control visual assets, videos, 3D scenes, media content, and interactive experiences for map-native and local-service scenarios. |
These product anchors are powered by a shared technical stack:
| Capability | Role |
| --- | --- |
| Reasoning agents and reinforcement learning | Train agents and models to reason, use tools, plan, self-reflect, and improve through feedback. |
|  | Connect vision, language, maps, videos, GUIs, urban scenes, user intent, and product signals. |
| Generative modeling | Produce and edit visual assets with controllability, quality, spatial consistency, temporal coherence, and product usability. |
| Benchmarks and evaluation | Turn real AMAP scenarios into reusable public tasks, metrics, datasets, and reproducible evaluation suites. |
Flagship Releases
These projects were chosen for how well they represent our work and for public adoption signals such as GitHub stars. They are organized around the two product anchors and the core AI systems that support them.
CoEvolve: Agent-data mutual evolution for training LLM agents. (Agent RL / ACL 2026)
Recent Updates
2026.05.18 MobilityBench has been accepted to KDD 2026: A Scalable Benchmark for Evaluating Route-Planning Agents in Real-World Mobility Scenarios.
2026.05.12 CoEvolve trains LLM agents through agent-data mutual evolution, using failure signals to synthesize harder tasks as the agent improves (ACL 2026).
2026.05.12 Thinking-with-Map strengthens geolocalization with a reinforced parallel map-augmented reasoning agent (ACL 2026 Findings).
2026.05.11 DreamX-World released the 5B-Cam model and inference code for interactive world simulation.
2026.04.22 DCW mitigates SNR-t bias and improves diffusion generation quality across model families (CVPR 2026).
2026.04.22 EMF extends efficient one-step generation from class-conditioned synthesis to text-conditioned image generation (CVPR 2026).
2026.04.10 SkillClaw turns real interaction traces into reusable, evolving skill libraries.
2026.04.01 MACE-Dance decouples motion generation and appearance synthesis for high-quality music-driven dance video (SIGGRAPH 2026).
2026.03.23 Omni-WorldBench evaluates world models in dynamic 4D interactive settings.
2026.03.20 AutoDrive-R2 improves VLA models with reasoning and self-reflection for autonomous driving scenarios (ICLR 2026).
2026.03.18 Video-STAR uses tool-augmented reinforcement learning for open-vocabulary action recognition in video (ICLR 2026).
2026.03.11 RL3DEdit uses geometry-guided reinforcement learning to make 3D scene edits more multi-view consistent (CVPR 2026).
Earlier Updates
2026.03.01 FE2E transfers image-editing priors into dense depth and normal estimation (CVPR 2026).
2026.02.28 FASA improves sparse decoding with frequency-aware attention (ICLR 2026).
2026.02.27 Eevee provides high-resolution data and evaluation for video-based virtual try-on (CVPR 2026 Findings).
2026.02.06 MobilityBench evaluates route-planning agents in real-world mobility scenarios (KDD 2026).
2026.02.06 SpatialGenEval benchmarks spatial intelligence in text-to-image models (ICLR 2026).
2026.02.06 Tree-GRPO replaces independent chain rollouts with tree-search rollouts for LLM agent reinforcement learning (ICLR 2026).
2026.02.04 Code2World predicts GUI transitions through renderable code generation.
2026.02.04 GPG provides a simple group policy gradient baseline for model reasoning (ICLR 2026).
2025.10.22 Taming-Hallucinations reduces MLLM video hallucinations with counterfactual video generation.
2025.06.20 FluxText provides a diffusion transformer baseline for scene-text editing.
Perception-aligned benchmark for video motion generation. (ICCV 2025)
For Collaborators and Applicants
We are actively looking for people who enjoy building strong AI systems: clean code, reproducible experiments, rigorous evaluation, ambitious problem selection, and real-world product impact.
If you are interested in research internships, full-time roles, or academic collaboration, please email cxxgtxy@gmail.com (homepage) with your CV, representative projects, and research interests.