BIT-DataLab

BIT-DataLab: Official repository of the Data Lab at the School of Computer Science, Beijing Institute of Technology. Led by Prof. Chengliang Chai.

BIT DataLab

Data Intelligence and Database Systems Lab
School of Computer Science, Beijing Institute of Technology

About Us

BIT DataLab is a research group at the School of Computer Science, Beijing Institute of Technology.
Our research focuses on data management, database systems, data-centric AI, large language models, data lakes, graph and spatio-temporal data, distributed data processing, and intelligent data analysis.

We build practical systems, open-source tools, datasets, and benchmarks for the research community.

Research Areas

  • Data Management and Database Systems
  • Data-centric AI and Data Preparation
  • Large Language Models for Data Analysis
  • Data Lakes, Data Discovery, and Retrieval-Augmented Generation
  • Graph, Spatio-temporal, Multimedia, and Uncertain Data
  • Distributed Data Processing and Query Optimization
  • Data Quality, Cleaning, Integration, and Provenance

Featured Projects

  • Edit-Banana: A framework for converting statistical figures and visual content into editable formats.
  • LakeBench: A benchmark and research resource for discovering joinable and unionable tables in data lakes.
  • Bench-U: A comprehensive benchmark for unstructured data analysis with large language models.

Join Us

We welcome motivated undergraduate students, graduate students, doctoral candidates, research interns, and collaborators who are interested in data management, database systems, data-centric AI, and large language models.

Popular repositories

  1. Edit-Banana

    Edit Banana: A framework for converting statistical figures and visual content into editable formats.

    Python · 5.4k stars · 363 forks

  2. LakeBench

    Python · 1.3k stars · 36 forks

  3. Bench-U

    Bench-U: Unstructured Data Analysis using LLMs: A Comprehensive Benchmark [Experiments & Analysis]

    Python · 7 stars · 5 forks

  4. PaperCitedRemarkAnalysis

    End-to-end pipeline for analyzing how influential citing papers (those with Fellow authors) refer to a target paper.

    Python · 3 stars · 1 fork

  5. bench-u-page

    TypeScript

  6. .github

    Official GitHub organization profile for BIT DataLab, School of Computer Science, Beijing Institute of Technology.
