Supercharge Your 3D Datasets with SynCD: A Practical Guide
SynCD is a tool for generating synthetic datasets of deformable and rigid objects. This guide is a step-by-step walkthrough that assumes no prior experience with the tool. It covers downloading the pre-generated dataset, setting up the environment, generating deformable and rigid datasets, and generating prompts with LLMs so you can tailor datasets to your specific needs.
Download the SynCD Dataset to Get Started Fast
Skip dataset generation initially and dive right in! Download our pre-generated, filtered dataset for immediate use.
Setting Up Your Environment for SynCD: What You Need
First, check your hardware: a GPU with at least 48 GB of VRAM is recommended for efficient processing. The base environment setup is described in the SynCD documentation. Complete those initial setup steps before proceeding.
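Before starting a long generation run, it can help to confirm that a GPU meets the VRAM recommendation. The sketch below is not part of SynCD; it assumes `nvidia-smi` is on your PATH, and the function names and the default threshold are illustrative. Cards sold as 48 GB can report somewhat less than 49152 MiB, so the default here is a loose, assumed cutoff that you may want to adjust.

```python
import subprocess


def parse_vram_mib(smi_output: str) -> list[int]:
    """Parse nvidia-smi output with one memory.total value (MiB) per line."""
    return [int(line.strip()) for line in smi_output.splitlines() if line.strip()]


def query_vram_mib() -> list[int]:
    """Ask nvidia-smi for the total memory of each GPU, in MiB."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_vram_mib(result.stdout)


def gpus_with_enough_vram(vram_mib: list[int], min_mib: int = 48_000) -> list[int]:
    """Return the indices of GPUs whose total memory is at least min_mib.

    The 48_000 MiB default is an assumption, not a SynCD requirement.
    """
    return [index for index, mib in enumerate(vram_mib) if mib >= min_mib]
```

Calling `gpus_with_enough_vram(query_vram_mib())` on the target machine returns the indices of the GPUs that pass the check; an empty list means none do.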
Method 1: Effortless Deformable Dataset Generation
Want to create training data for objects that bend and flex? Here’s how easy it is:
- Navigate to the dataset directory:
cd dataset
- Run the generation script:
python gen_deformable.py --save_attn_mask --outdir assets/metadata/deformable_data
This command generates a deformable SynCD dataset, saves attention masks along with it (--save_attn_mask), and writes the output to the directory given by --outdir. Datasets of this kind may be useful for applications such as augmented reality.
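After the script finishes, a quick file count is a simple sanity check that something was written. This guide does not describe the output layout, so the sketch below makes no assumptions about it and only tallies files by extension; `summarize_outdir` is a hypothetical helper, not a SynCD function.

```python
from collections import Counter
from pathlib import Path


def summarize_outdir(outdir: str) -> Counter:
    """Count files under outdir by lowercase extension ('' means no extension)."""
    root = Path(outdir)
    if not root.is_dir():
        raise FileNotFoundError(f"{outdir} does not exist; did the script finish?")
    # rglob("*") walks the whole tree, so nested subfolders are included.
    return Counter(p.suffix.lower() for p in root.rglob("*") if p.is_file())
```

For example, `summarize_outdir("assets/metadata/deformable_data")` returns a mapping from extension to file count for the directory used in the command above.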
Method 2: Rigid Datasets: Precision and Control
Generating datasets for rigid objects requires a bit more setup but offers excellent control. Here's how:
- Download Pre-generated Prompts:
wget https://www.cs.cmu.edu/~syncd-project/assets/prompts_objaverse.pt -P assets/generated_prompts/
- Extract Rendering Metadata:
bash assets/unzip.sh assets/metadata/objaverse_rendering/
- Run the Generation Script:
torchrun --nnodes=1 --nproc_per_node=1 --node_rank=0 --master_port=12356 gen_rigid.py --rootdir ./assets/metadata --promptpath assets/generated_prompts/prompts_objaverse.pt --outdir assets/metadata/rigid_data
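If you launch the rigid generation from a script, building the command as an argument list avoids quoting mistakes in a long one-liner. The sketch below only reproduces the flags shown above; `build_rigid_cmd` is a hypothetical helper. Whether `gen_rigid.py` benefits from more than one process per node is not covered here, so treat changes to `nproc_per_node` as an experiment.

```python
import shlex


def build_rigid_cmd(
    outdir,
    nproc_per_node=1,
    master_port=12356,
    rootdir="./assets/metadata",
    promptpath="assets/generated_prompts/prompts_objaverse.pt",
):
    """Return the torchrun invocation from this guide as an argument list."""
    return [
        "torchrun",
        "--nnodes=1",
        f"--nproc_per_node={nproc_per_node}",
        "--node_rank=0",
        f"--master_port={master_port}",
        "gen_rigid.py",
        "--rootdir", rootdir,
        "--promptpath", promptpath,
        "--outdir", outdir,
    ]


cmd = build_rigid_cmd("assets/metadata/rigid_data")
# Print the shell-quoted form for review before running it.
print(shlex.join(cmd))
```

Passing `cmd` to `subprocess.run(cmd, check=True)` would then start the run and raise an error if the script exits with a failure code.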
Objaverse-Guided Rigid Dataset Generation: Advanced Techniques
For even more realistic datasets, leverage Objaverse assets. This method involves re-rendering assets and calculating multi-view correspondence. Here's a breakdown:
- Install PyTorch3D and Dependencies:
pip install objaverse ninja trimesh "git+https://github.com/facebookresearch/pytorch3d.git"
- Download and Extract Objaverse Renderings:
cd assets/metadata/objaverse_rendering
wget https://huggingface.co/datasets/nupurkmr9/objaverse_rendering/resolve/main/archive_1.zip
unzip archive_1.zip
cd ../../../
bash assets/unzip.sh assets/metadata/objaverse_rendering/
- Calculate Multi-View Correspondence:
python gen_corresp.py --download --rendered_path ./assets/metadata/objaverse_rendering --objaverse_path ./assets/metadata/objaverse_assets --outdir ./assets/metadata
- Run the Final Dataset Generation Command:
torchrun --nnodes=1 --nproc_per_node=1 --node_rank=0 --master_port=12356 gen_rigid.py --rootdir ./assets/metadata --promptpath assets/generated_prompts/prompts_objaverse.pt --outdir <output-path-to-save-dataset>
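Because the final command depends on files produced by the earlier steps, a short pre-flight check can catch a missing download before the run starts. The paths below are taken from the commands in this guide; the helper name `missing_inputs` is hypothetical, and the list is a minimal assumption about what must exist, not a complete specification of what `gen_rigid.py` reads.

```python
from pathlib import Path

# Paths referenced by the commands in this guide, relative to the working directory.
REQUIRED = (
    "gen_rigid.py",
    "assets/generated_prompts/prompts_objaverse.pt",
    "assets/metadata/objaverse_rendering",
)


def missing_inputs(base_dir=".", required=REQUIRED):
    """Return the required relative paths that do not exist under base_dir."""
    base = Path(base_dir)
    return [rel for rel in required if not (base / rel).exists()]
```

Running `missing_inputs()` from the directory where you invoke `torchrun` returns an empty list when everything is in place, or the paths that still need to be downloaded or generated.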
Leverage LLMs to Generate Custom Prompts
Want to create datasets tailored to specific object categories? SynCD lets you generate prompts using Large Language Models (LLMs).
- Download Cap3D Object Captions:
wget "https://huggingface.co/datasets/tiange/Cap3D/resolve/main/misc/Cap3D_automated_Objaverse_old.csv?download=true" -O Cap3D_automated_Objaverse_old.csv
- Run the Prompt Generation Script:
python gen_prompts.py --rigid --captions Cap3D_automated_Objaverse_old.csv
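To focus on particular object categories, one option is to look through the captions file for matching entries before generating prompts. The sketch below assumes the CSV is headerless with two columns, an object ID followed by a caption; verify this against the downloaded file. `captions_matching` is a hypothetical helper, and the sample rows are made up for illustration.

```python
import csv
import io


def captions_matching(csv_file, keyword):
    """Yield (uid, caption) pairs whose caption contains keyword, ignoring case.

    Assumes each row is (uid, caption) with no header row.
    """
    needle = keyword.lower()
    for row in csv.reader(csv_file):
        if len(row) >= 2 and needle in row[1].lower():
            yield row[0], row[1]


# Made-up sample rows standing in for the real file.
sample = io.StringIO(
    'abc123,"A red wooden chair, four legs"\n'
    "def456,A blue ceramic mug\n"
)
print(list(captions_matching(sample, "chair")))
# prints [('abc123', 'A red wooden chair, four legs')]
```

With the real file, open it using `open("Cap3D_automated_Objaverse_old.csv", newline="", encoding="utf-8")` and pass the file object in place of `sample`. Whether `gen_prompts.py` accepts a filtered copy of the CSV is an assumption to test on a small subset first.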
Next Steps: Unleash the Power of Synthetic Data Generation
You're now equipped to create customized 3D datasets using SynCD. Experiment with different settings and consult the SynCD documentation for further options. Choosing the object type and the prompts yourself gives you direct control over what goes into your training data.