Exploring the World and Life.

My name is Qiaosheng Chen (陈乔晟). I am currently a third-year Ph.D. candidate in the Websoft Research Group at the School of Computer Science and Technology, Nanjing University. I received my B.Eng. degree from Harbin Institute of Technology at Weihai in 2021. In the same year, I was admitted to the M.Sc. program at Nanjing University with an exemption from the entrance examination. In 2023, I began my Ph.D. studies under the supervision of Prof. Gong Cheng.

My research interests include Web Code Generation, Big Data Search, Knowledge Graphs, and Retrieval-Augmented Generation (RAG). I have published 5 CCF-A papers and 3 CCF-B papers as first or co-first author. I received a Best Research Paper Nomination at ISWC 2023 and the Best Application Paper award at CCF BigData 2025. I was also selected for the inaugural CAST Young Talent Support Program (Ph.D. Special Plan), sponsored by the Chinese Information Processing Society of China (CIPS).

Feel free to reach out if you are interested in collaboration or potential opportunities.

News

  • 2026.04 One paper accepted by ICML 2026 (first author).
  • 2026.03 Joined the Alibaba Qwen Team to work on WebDev pre-training.
  • 2025.12 One paper accepted by ICLR 2026 (co-first author).
  • 2025.11 Joined the Tencent HY AI Data Team for the Deep Research Agent project.
  • 2025.07 Two papers accepted by SIGIR 2025.
  • 2025.04 Joined Shanghai AI Lab as a research intern.
  • 2024.12 Received the National Scholarship (Ph.D.) and was named an Outstanding Graduate Student Pioneer at NJU.
  • 2024.10 Selected for the inaugural CAST Young Talent Support Program (Ph.D. Special Plan, CIPS).
  • 2024.07 One paper accepted by ISWC 2024 (first author).
  • 2024.03 Two papers accepted by SIGIR 2024 (both first author).

Education

Nanjing University
2021.09 - Present
Ph.D. in Computer Science advised by Prof. Gong Cheng
National Scholarship (Ph.D.) | Outstanding Graduate Student Pioneer
Harbin Institute of Technology at Weihai
2017.09 - 2021.06
B.Eng. in Computer Science (GPA: 90.11/100, Rank: 5/137)
National Scholarship (Undergraduate) | Outstanding Graduate of Shandong Province

Experience

Qwen Team, Alibaba (Tongyi Lab)
2026.03 - 2026.06
Research Intern · WebDev Pre-training
Worked on pre-training data cleaning, filtering, and synthesis for Qwen's code capabilities; designed file-level and repo-level code data pipelines.
AI Data Team, Tencent TEG (HunYuan)
2025.11 - 2026.02
Research Intern · Deep Research Agent
Designed DAG-based planning and summarization strategies; applied SFT and RL to improve the agent's planning capabilities.
Shanghai AI Lab
2025.04 - 2025.09
Research Intern · Interactive Scientific Demo Code Generation & Multimodal Code LLM
Led research on interactive science demo generation (ICML 2026); contributed HTML code data for JanusCoder (ICLR 2026) and Intern-S1-Pro.

Publications

(* equal contribution · † corresponding author)

InteractScience
InteractScience: Programmatic and Visually-Grounded Evaluation of Interactive Scientific Demonstration Code Generation
Qiaosheng Chen, Yang Liu, Lei Li, Kai Chen, Qipeng Guo, Gong Cheng, Fei Yuan.
Studied LLMs' ability to generate interactive scientific demonstration website code. Proposed hard tests for interactivity and soft tests based on multi-screenshot comparison.
ICML 2026  
JanusCoder
JanusCoder: Towards a Foundational Visual-Programmatic Interface for Code Intelligence
Qiushi Sun*, Jingyang Gong*, Yang Liu*, Qiaosheng Chen*, Lei Li, Kai Chen, Qipeng Guo, Ben Kao, Fei Yuan.
Trained language and multimodal models for code plotting, frontend web generation, multimodal algorithm problems, and niche visualization languages.
ICLR 2026  
HuggingKG
Benchmarking Recommendation, Classification, and Tracing Based on Hugging Face Knowledge Graph
Qiaosheng Chen, Kaijia Huang, Xiao Zhou, Weiqing Luo, Yuanning Cui, Gong Cheng.
Built HuggingKG, the first AI resource knowledge graph based on Hugging Face (2.6M nodes, 6.2M edges), and designed HuggingBench covering recommendation, classification, and tracing tasks.
SIGIR 2025  
μDS
μDS: Multi-Objective Data Snippet Extraction for Dataset Search
Xiao Zhou, Qiaosheng Chen, Jiageng Chen, Gong Cheng.
Proposed μDS, which jointly optimizes compactness, relevance, representativeness, and cohesiveness for data snippet extraction. The task is modeled as a novel combinatorial optimization problem with worst-case approximation guarantees.
SIGIR 2025  
CDS
Enhancing Dataset Search with Compact Data Snippets
Qiaosheng Chen, Jiageng Chen, Xiao Zhou, Gong Cheng.
Proposed CDS, a subgraph-extraction-based method for generating compact, query-relevant data snippets that improve retrieval accuracy and result interpretability for dataset search.
SIGIR 2024  
ACORDAR 2.0
ACORDAR 2.0: A Test Collection for Ad Hoc Dataset Retrieval with Densely Pooled Datasets and Question-Style Queries
Qiaosheng Chen, Weiqing Luo, Zixian Huang, Tengteng Lin, Xiaxia Wang, Ahmet Soylu, Basil Ell, Baifan Zhou, Evgeny Kharlamov, Gong Cheng.
Built ACORDAR 2.0, a content-based dataset retrieval test collection that uses dense retrieval to expand candidate datasets and LLM-based query rewriting to improve evaluation diversity.
SIGIR 2024  
DR2
Dense Re-Ranking with Weak Supervision for RDF Dataset Search
Qiaosheng Chen, Zixian Huang, Zhiyang Zhang, Weiqing Luo, Tengteng Lin, Qing Shi, Gong Cheng.
Proposed DR2, which uses distant supervision and self-training to generate pseudo-labeled data, together with coarse-to-fine training, to improve dense retrieval models for dataset search.
ISWC 2023   🏆 Best Research Paper Nomination

Projects

CN-PDS
China Public Data Search System (CN-PDS)
Qiaosheng Chen (Project Leader)
Led the design and development of a national-scale public data search system. Collected, integrated, and indexed datasets from 148 open data portals across 25 provinces in China. Implemented keyword search, faceted search, and result presentation. Leveraged LLMs for automated metadata integration, high-precision dataset ranking, and relevance explanation.
CCF BigData 2023 · 🏆 CCF BigData 2025 Best Application Paper · DSE 2026  

Awards

  • 2024, Inaugural CAST Young Talent Support Program (Ph.D. Special Plan, sponsored by CIPS, 3,226 awardees nationwide)
  • 2024, National Scholarship (Ph.D.), Nanjing University (14 awardees in CS School)
  • 2024, Outstanding Graduate Student Pioneer, Nanjing University (15 awardees in CS School)
  • 2021, Outstanding Graduate of Shandong Province
  • 2019, CCPC National Finals Bronze Medal
  • 2019, ACM-ICPC Asia Regional Contest (Shanghai) Silver Medal
  • 2019, CCPC Xiamen Site Silver Medal
  • 2018, ACM-ICPC Asia Regional Contest (Xuzhou) Silver Medal
  • 2018, National Scholarship (Undergraduate), HIT Weihai (5 awardees in CS School)

Services

  • Reviewer for ICML 2026 (Gold Reviewer), SIGIR 2025-2026, WWW 2026, KDD 2026, CIKM 2024-2026, WSDM 2026.