Data Intelligence and Database Systems Lab
School of Computer Science, Beijing Institute of Technology
BIT DataLab is a research group at the School of Computer Science, Beijing Institute of Technology.
Our research focuses on data management, database systems, data-centric AI, large language models, data lakes, graph and spatio-temporal data, distributed data processing, and intelligent data analysis.
We build practical systems, open-source tools, datasets, and benchmarks for the research community.
- Data Management and Database Systems
- Data-centric AI and Data Preparation
- Large Language Models for Data Analysis
- Data Lakes, Data Discovery, and Retrieval-Augmented Generation
- Graph, Spatio-temporal, Multimedia, and Uncertain Data
- Distributed Data Processing and Query Optimization
- Data Quality, Cleaning, Integration, and Provenance
- A framework for converting statistical figures and visual content into editable formats.
- A benchmark and research resource for discovering joinable and unionable tables in data lakes.
- A comprehensive benchmark for unstructured data analysis with large language models.
- PaperCitedRemarkAnalysis — An end-to-end pipeline for analyzing how influential citations refer to a target paper.
- bench-u-page — Web resources for Bench-U.
- All repositories
We welcome motivated undergraduate students, graduate students, doctoral candidates, research interns, and collaborators who are interested in data management, database systems, data-centric AI, and large language models.
- BIT DataLab on GitHub
- School of Computer Science, Beijing Institute of Technology
- Beijing Institute of Technology
