add nuscenes-qa-mini dataset
hosiet committed Jan 19, 2024
1 parent ffdee9c commit f998e20
Showing 3 changed files with 22 additions and 2 deletions.
21 changes: 20 additions & 1 deletion content/dataset/index.md
@@ -14,9 +14,9 @@ sections:
spacing:
padding: ['1.8rem', '0', '0', '0']
- block: markdown
id: aware
content:
title: Acoustic Waveform Respiratory Evaluation (AWARE)
id: aware
subtitle: January 2024
text: |
{{< columns >}}
@@ -38,6 +38,25 @@
columns: '2'
spacing:
padding: ['20px', '0', '20px', '0']
- block: markdown
id: nuscenes-qa-mini
content:
title: NuScenes-QA-mini Dataset
subtitle: 'January 2024'
text: |
This dataset is used for multimodal question-answering tasks in autonomous driving scenarios. We created it based on the [nuScenes-QA dataset](https://github.com/qiantianwen/NuScenes-QA) for evaluation in our paper [Modality Plug-and-Play: Elastic Modality Adaptation in Multimodal LLMs for Embodied AI](/publication/2023-mpnp-llm/). The samples are divided into day and night scenes.
![NuScenes-QA-mini Data Sample](2024-nuscenes-qa-mini/nuqa_example.png)
This dataset is built on the [nuScenes](https://www.nuscenes.org/) mini-split, with the QA pairs obtained from the [original nuScenes-QA dataset](https://github.com/qiantianwen/NuScenes-QA). Each data sample contains **6-view RGB camera captures, a 5D LiDAR point cloud, and a corresponding text QA pair**. The data in the nuScenes-QA dataset is collected from driving scenes in the cities of Boston and Singapore, covering diverse locations, times, and weather conditions. A minimal loading sketch is shown after the links below.
{{< hr-pittisl >}}
* You may find more details on our [dataset homepage](https://huggingface.co/datasets/KevinNotSmile/nuscenes-qa-mini).
* The source code for generating the dataset can be found [in our GitHub repository](https://github.com/pittisl/mPnP-LLM/tree/main/nuqamini).
* Our [Modality Plug-and-Play](/publication/2023-mpnp-llm/) paper utilizes this dataset.
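For a quick programmatic look at the data, here is a minimal loading sketch. It assumes the Hugging Face `datasets` library and the repository id `KevinNotSmile/nuscenes-qa-mini` from the homepage link above; the configuration names, split names, and per-sample field names printed below are assumptions to verify on the dataset homepage, not guarantees.
```python
from datasets import get_dataset_config_names, load_dataset

repo = "KevinNotSmile/nuscenes-qa-mini"  # repository id from the dataset homepage above

# Discover the available configurations. The dataset is described as being
# split into day and night scenes, but the exact config names are not
# confirmed here.
configs = get_dataset_config_names(repo)
print("configs:", configs)

# Load the first configuration and inspect one sample. Each sample is
# expected to carry the 6-view camera captures, the LiDAR point cloud, and
# the text QA pair described above. If the dataset ships a loading script,
# you may also need to pass trust_remote_code=True.
ds = load_dataset(repo, configs[0] if configs else None)
split = next(iter(ds))
print(split, list(ds[split][0].keys()))
```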
design:
columns: '2'
spacing:
padding: ['20px', '0', '20px', '0']
# - block: markdown
# content:
# title:
3 changes: 2 additions & 1 deletion content/publication/2023-mpnp-llm/index.md
@@ -105,7 +105,8 @@ input modalities, namely RGB camera view and LiDAR point cloud, is shown in the

We use the nuScenes-QA dataset for multimodal visual QA in autonomous driving,
with results from a workstation-level desktop platform with an RTX A6000 GPU and a
mobile platform, the Nvidia Jetson AGX Orin. Our processed dataset is published as the
[NuScenes-QA-mini dataset](/dataset/#nuscenes-qa-mini).

Compared with existing approaches, mPnP-LLM achieves better accuracy under similar costs.

